Data Modeling #1 Haystack at Scale in Australia & Data Driven Gap Analysis Leon Wurfel BUENO (Built Environment Optimisation) Tuesday, May 19, 2015
Who is BUENO? A 2-year-old Australian-based analytics company. We provide analytics-driven managed services over 100,000 points on more than 40 sites across Australia (combined area of over 2M m² / 21.5M ft²). On these points, we run over 100 analytical rules to identify a range of problems. We employ the Haystack tagging standard as far as possible to enable our analytics.
Applying Haystack at scale. When the number of points grows large, we have to be systematic in how the tags are applied and maintained. [Figure: deployed points by month, January 2013 to May 2015, broken down by sector (retail, office, education, hotel, health); y-axis: deployed points, 0 to 180,000.]
Overview of total BUENO points. [Figure: point-level tags, single category "All points"; axis: deployed points, 0 to 120,000.]
The tagging process. Points and histories are imported into our analytics platform. We take as many points as possible from the site, even if their function is unclear or we have no analytics written. The points are then tagged according to the Haystack standard as far as possible. Points for which we have analytics are prioritised.
Haystack commissioning processes. After the tagging has been completed at a site, we employ a number of commissioning processes to check that: points are tagged correctly and according to Haystack; the structure of the site, as implied by Haystack, matches our mental model; our analytics can discover all the points they need in order to run.
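The third check (can the analytics discover all the points they need?) can be illustrated with a minimal Python sketch. This is not BUENO's SkySpark implementation: the point records, the rule name, and the helper `find_missing_points` are hypothetical, and points are assumed to be plain dictionaries carrying a set of marker tags and an `equipRef`.

```python
from collections import defaultdict

# Hypothetical point records: an id, the equip the point belongs to,
# and the set of marker tags applied to it.
points = [
    {"id": "p1", "equipRef": "ahu-1", "tags": {"point", "discharge", "air", "temp", "sensor"}},
    {"id": "p2", "equipRef": "ahu-1", "tags": {"point", "return", "air", "temp", "sensor"}},
    {"id": "p3", "equipRef": "ahu-2", "tags": {"point", "discharge", "air", "temp", "sensor"}},
]

# Hypothetical rule inputs: each named input is the tag set a point must carry.
rule_requirements = {
    "dischargeTemp": {"discharge", "air", "temp", "sensor"},
    "returnTemp": {"return", "air", "temp", "sensor"},
}


def find_missing_points(points, requirements):
    """Return {equip: [input names]} for inputs that no point on that equip satisfies."""
    tags_by_equip = defaultdict(list)
    for point in points:
        tags_by_equip[point["equipRef"]].append(point["tags"])

    missing = {}
    for equip, tag_sets in tags_by_equip.items():
        # An input is satisfied when some point carries every required tag.
        absent = [
            name
            for name, needed in requirements.items()
            if not any(needed <= tags for tags in tag_sets)
        ]
        if absent:
            missing[equip] = absent
    return missing


print(find_missing_points(points, rule_requirements))  # {'ahu-2': ['returnTemp']}
```

Running a check of this kind per rule and per site gives a list of equipment on which a rule cannot run, which is one way to turn the commissioning step into a repeatable report.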
Commissioning Haystack in SkySpark
Haystack-to-LaTeX bridge
Number of points tagged according to Haystack. [Figure: point-level tags, split into "Haystack tags" and "Rest of points"; axis: deployed points, 0 to 120,000.]
Our analytics require some additional tags. In some instances, we need to implement non-Haystack tags to enable specific analytics. Reasons for creating non-Haystack tags include: the required tag is not yet part of Haystack; the site configuration/structure is not accounted for in Haystack; in some cases, tags are used dynamically as data storage to save on computations. Over the long term, our tag library converges towards the Haystack standard.
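The split between Haystack and non-Haystack tags can be sketched as a simple classification of each point's tag set. The snippet below is an illustration only: `HAYSTACK_TAGS` is a small subset of the standard's marker tags rather than the full vocabulary, the custom tags `nightPurge` and `liftCar` are invented for the example, and the category names are assumptions.

```python
from collections import Counter

# Small illustrative subset of Project Haystack marker tags (not the full standard).
HAYSTACK_TAGS = {
    "point", "sensor", "cmd", "sp", "air", "temp",
    "discharge", "return", "ahu", "vav", "fan", "pump",
}


def classify(tags):
    """Label a point by where its tags come from."""
    if not tags:
        return "untagged"
    standard = tags & HAYSTACK_TAGS   # tags found in the Haystack subset
    custom = tags - HAYSTACK_TAGS     # anything else is treated as a custom tag
    if standard and custom:
        return "custom + haystack"
    if standard:
        return "haystack only"
    return "custom only"


# Hypothetical tag sets for four points.
point_tags = [
    {"discharge", "air", "temp", "sensor"},
    {"fan", "cmd", "nightPurge"},   # "nightPurge" is an invented custom tag
    {"liftCar"},                    # invented tag for a non-HVAC system
    set(),
]

summary = Counter(classify(tags) for tags in point_tags)
print(summary)
```

Tracking these counts over time is one way to see whether a custom tag library is converging towards the standard.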
Points with both Haystack and BUENO tags. [Figure: point-level tags, split into "Bueno + Haystack tags", "Haystack tags" and "Rest of points"; axis: deployed points, 0 to 120,000.]
Non-HVAC building systems. Types of equipment or building system not yet part of Haystack: car park and other access control data; vertical transportation; waste management; CCTV.
Other systems and remainder of points. [Figure: point-level tags, split into "Bueno + Haystack tags", "Haystack tags", "Other building information systems" and "Untagged"; axis: deployed points, 0 to 120,000.]
GAP ANALYSIS. Reasons why points are not tagged: the meaning/function of the point is not clear at the time of tagging; no analytics exist for this type of point, and since tagging resources are finite these points are left untagged; no tags exist for the point/equip type; deployment error.
GAP ANALYSIS - dictionary attack. All of the untagged points were exported from SkySpark. A dictionary of potential matches was defined. We then searched within the point names for the dictionary entries. Pairings of matches were allowed (two or more matches commonly found together).
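A dictionary search over point names can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis actually run: the dictionary entries, the abbreviation `chlr`, the sample point names, and the choice to report pairings as labels joined with `+` are all hypothetical.

```python
from collections import Counter

# Hypothetical dictionary: label -> substrings to search for in a point name.
DICTIONARY = {
    "ahu": ["ahu"],
    "vav": ["vav"],
    "fan": ["fan"],
    "pump": ["pump"],
    "chiller": ["chiller", "chlr"],
    "boiler": ["boiler"],
    "meter": ["meter"],
}


def match_point(name, dictionary=DICTIONARY):
    """Return the matched labels joined with '+', or 'unknown' if none match."""
    lowered = name.lower()
    hits = sorted(
        label
        for label, needles in dictionary.items()
        if any(needle in lowered for needle in needles)
    )
    return "+".join(hits) if hits else "unknown"


def count_matches(names):
    """Count how many point names fall under each match (or pairing of matches)."""
    return Counter(match_point(name) for name in names)


# Invented point names in the style of a BMS export.
untagged = [
    "L3_AHU-2_SupplyFan_Status",
    "CHLR1_CHW_Pump_Spd",
    "VAV_12_ZnTmp",
    "Spare_AI_07",
]

print(count_matches(untagged))
```

Plain substring matching is crude (a short entry such as `fan` can match inside unrelated words), so the counts are best read as a guide to where tagging or analytics effort might go rather than as exact figures.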
GAP ANALYSIS - associations. The untagged points can drive analytics development (and potentially require new tags). They demonstrate where deployment may require better processes. They highlight the need for better development of a particular building component. [Figure: count of matches (0 to 3,500) by dictionary match in point name string: boiler, coolingtower, boilerplant, fan, pump, chillerplant, chiller, meter, pac, hotelroom, unknown, ahu, vav.]
Why is it important? [Figure: deployed points by month, January 2013 to May 2016, broken down by sector (retail, office, education, hotel, health, carpark, access control, vertical transport, supermarket); y-axis: deployed points, 0 to 700,000.]
Final thoughts. Haystack has been invaluable in getting this far, but it's not enough. We need to expand the focus to consider all building information systems and their semantic language requirements: there is value in all data from all systems. We need to move now because the space is growing so fast.