A Challenge in Open Data
School of Engineering, University of Edinburgh
Thursday, 16th of May, 2024
For the past two decades there has been significant discussion on the reproducibility crisis in science
Reproducibility: Can you get the same answers that I did when you analyze my data?
Replicability: Can you get the same answers that I did when you do my experiment and collect your own data?
FAIR stands for Findable, Accessible, Interoperable, Reusable: the principles that should guide scientific data management and stewardship.
Findable: The first step in making data reusable is making it findable. Detailed and accurate metadata is key.
Accessible: Data could be openly available or it could require prior authentication and authorisation
Interoperable: Data must be usable across different programs and workflows
Reusable: Well-defined data is easier to understand and therefore to use, combine and/or extend
For visualisation purposes and sharing basic simulation data, .vtk/.vtu/.vtp files are very well supported
# vtk DataFile Version 3.1
Lattice Boltzmann data
ASCII
DATASET UNSTRUCTURED_GRID
POINTS 9 INT
0 0 0 1 0 0 2 0 0
0 1 0 1 1 0 2 1 0
0 2 0 1 2 0 2 2 0
CELLS 4 20
4 0 1 3 4
4 1 2 4 5
4 3 4 6 7
4 4 5 7 8
CELL_TYPES 4
8 8 8 8
CELL_DATA 4
SCALARS Scal_1 DOUBLE
LOOKUP_TABLE default
1 2 1 0
SCALARS Scal_2 DOUBLE
LOOKUP_TABLE default
1 3 2 1
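To show what reading such a file for analysis involves, here is a minimal Python sketch that parses only the keywords appearing in the sample above (POINTS, CELLS, CELL_TYPES, CELL_DATA, SCALARS). The function name `parse_legacy_vtk` is hypothetical, and the sketch assumes single-component scalars that are each followed by a LOOKUP_TABLE line, as in the sample. It is not a general legacy VTK reader.

```python
SAMPLE = """# vtk DataFile Version 3.1
Lattice Boltzmann data
ASCII
DATASET UNSTRUCTURED_GRID
POINTS 9 INT
0 0 0 1 0 0 2 0 0
0 1 0 1 1 0 2 1 0
0 2 0 1 2 0 2 2 0
CELLS 4 20
4 0 1 3 4
4 1 2 4 5
4 3 4 6 7
4 4 5 7 8
CELL_TYPES 4
8 8 8 8
CELL_DATA 4
SCALARS Scal_1 DOUBLE
LOOKUP_TABLE default
1 2 1 0
SCALARS Scal_2 DOUBLE
LOOKUP_TABLE default
1 3 2 1
"""


def parse_legacy_vtk(text):
    """Parse the subset of legacy ASCII VTK keywords used in SAMPLE."""
    lines = text.strip().splitlines()
    # Line 0 is the version banner, line 1 the title, line 2 ASCII or BINARY.
    if lines[2].strip().upper() != "ASCII":
        raise ValueError("only ASCII files are handled in this sketch")
    tokens = iter(" ".join(lines[3:]).split())
    data = {"title": lines[1].strip(), "points": [], "cells": [],
            "cell_types": [], "cell_scalars": {}}
    n_cell_values = 0
    for token in tokens:
        key = token.upper()
        if key == "DATASET":
            data["dataset"] = next(tokens)
        elif key == "POINTS":
            n_points = int(next(tokens))
            next(tokens)  # skip the data type
            flat = [float(next(tokens)) for _ in range(3 * n_points)]
            data["points"] = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]
        elif key == "CELLS":
            n_cells = int(next(tokens))
            next(tokens)  # skip the total integer count
            for _ in range(n_cells):
                size = int(next(tokens))
                data["cells"].append([int(next(tokens)) for _ in range(size)])
        elif key == "CELL_TYPES":
            n_types = int(next(tokens))
            data["cell_types"] = [int(next(tokens)) for _ in range(n_types)]
        elif key == "CELL_DATA":
            n_cell_values = int(next(tokens))
        elif key == "SCALARS":
            name = next(tokens)
            next(tokens)  # skip the data type
            next(tokens)  # skip the LOOKUP_TABLE keyword
            next(tokens)  # skip the table name
            data["cell_scalars"][name] = [float(next(tokens))
                                          for _ in range(n_cell_values)]
    return data


mesh = parse_legacy_vtk(SAMPLE)
print(len(mesh["points"]), "points,", len(mesh["cells"]), "cells")
```

Even for this small file the reader has to track state (for example, the CELL_DATA count that governs every later SCALARS block), which hints at why the format is less convenient for analytics than for visualisation.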
But how well do .vtk/.vtu/.vtp files serve data analytics?
<VTKFile type="PolyData" ...>
<PolyData>
<Piece NumberOfPoints="#" NumberOfVerts="#" NumberOfLines="#"
NumberOfStrips="#" NumberOfPolys="#">
<PointData>...</PointData>
<CellData>...</CellData>
<Points>...</Points>
<Verts>...</Verts>
<Lines>...</Lines>
<Strips>...</Strips>
<Polys>...</Polys>
</Piece>
</PolyData>
</VTKFile>
<?xml version="1.0"?>
<VTKFile type="PolyData" version="1.0" byte_order="LittleEndian" header_type="UInt64">
<PolyData>
<Piece NumberOfPoints="20" NumberOfVerts="20" NumberOfLines="0" NumberOfStrips="0" NumberOfPolys="0">
<PointData Scalars="1_temp" Vectors="3_force">
<DataArray Name="1_temp" NumberOfComponents="1" type="Float64" format="ascii">
0.8096282113891782 0.3047311952960293 0.6500663143518912 0.8456637563467384 0.3906199913247039 0.0493153326814327
0.0953282361963043 0.4089217432664466 0.9734786110287174 0.7813877948631190 0.0869568546620415 0.7417237568666443
0.5630292772867312 0.8719294300090183 0.8044927228429963 0.3760636376786568 0.1306022635272779 0.2908653129507012
0.3505188438702752 0.2901130193235927
</DataArray>
<DataArray Name="2_pressure" NumberOfComponents="1" type="Float64" format="ascii">
0.5017704290104593 0.3785371246509415 0.0205158449338875 0.9907952905474608 0.1545654933391684 0.8782385634278128
0.6019489853133294 0.7906981230999223 0.2521929126363432 0.9716150255608206 0.0672474944937144 0.4673686197663716
0.0337000095394104 0.5347613958477050 0.1164155276297819 0.9550270647902264 0.5796338522273943 0.3412746271057875
0.8523204100536866 0.1972657806074329
</DataArray>
<DataArray Name="3_force" NumberOfComponents="3" type="Float64" format="ascii">
0.0309346793427824 0.4092653096970628 0.9788367317195323 0.4223113198793231 0.0900433594418957 0.1787326345033293
0.2966289948749271 0.6954539758304935 0.2832782062952210 0.3251789785540518 0.7688481562679232 0.4766407185048503
0.9288534597190189 0.3341192036963977 0.9436476306446026 0.8789762057451738 0.0606108299327002 0.9254951873329387
0.7412746153715273 0.8109872417934331 0.4360667170346966 0.7233454095760591 0.6976210279118584 0.2955977110451682
0.0994604820813603 0.1875580422474142 0.8431228106210912 0.6213582623034001 0.6158399792301240 0.9406265592401716
0.8845641277007388 0.9350856253927768 0.2970236729050861 0.8712474915967705 0.6910999373713885 0.3187032732785566
0.0127516549008907 0.8755324934933010 0.5639212047451645 0.1967190822555136 0.9738387864540907 0.6568564085127088
0.0096719914074663 0.4238716923324118 0.3943384132270378 0.5395962035976353 0.3564427828885958 0.2408352500616302
0.7594055518045913 0.7632195886304822 0.4137798466081193 0.4414728807283862 0.5239588201936661 0.5330972627254891
0.9141526095541878 0.6308069665444429 0.5170567967508454 0.1320175579472910 0.2107269322401846 0.5006082952330471
</DataArray>
</PointData>
<CellData>
</CellData>
<Points>
<DataArray Name="Points" NumberOfComponents="3" type="Float64" format="ascii">
0.4679781489940003 0.0539201795235295 0.8785521137838496 0.6662011241227854 0.6798039006229138 0.5179184112127887
0.7253673934646818 0.5500273845649549 0.9614782388606561 0.4565839967938599 0.5347282488901165 0.3175208696688832
0.1659279070777949 0.0767387959844464 0.7871958705242772 0.7152392271177217 0.6711429183806140 0.1804037800561749
0.0094476341085491 0.2176115805917300 0.6161777555253786 0.6766546248504622 0.4753594409825290 0.0787254199622341
0.2116737146235712 0.9337516804459353 0.9051117007473103 0.5781205664492428 0.1021111481025472 0.0771094947349589
0.3606986784532169 0.2044204606938260 0.9588362141721601 0.5373476404904579 0.0037939495868201 0.5440923148342509
0.0312391309332078 0.3297690748477242 0.8634945621439319 0.0935309527775745 0.9241324203495521 0.2405794370719853
0.5654258596612191 0.7982712785837751 0.4059594735292160 0.4933549421178993 0.0290714379783759 0.2925235663071808
0.4346748309322130 0.2692194427848057 0.7309894801048282 0.3220642954109557 0.9695141104130361 0.9700483721187488
0.0062263950599849 0.3090008720062422 0.8318148406832973 0.0737151178494130 0.8033901608302160 0.4849158872100066
</DataArray>
</Points>
<Verts>
<DataArray Name="connectivity" NumberOfComponents="1" type="Int32" format="ascii">
0 1 2 3 4 5
6 7 8 9 10 11
12 13 14 15 16 17
18 19
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Int32" format="ascii">
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20
</DataArray>
</Verts>
<Lines>
<DataArray Name="connectivity" NumberOfComponents="1" type="Float64" format="ascii">
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Float64" format="ascii">
</DataArray>
</Lines>
<Strips>
<DataArray Name="connectivity" NumberOfComponents="1" type="Float64" format="ascii">
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Float64" format="ascii">
</DataArray>
</Strips>
<Polys>
<DataArray Name="connectivity" NumberOfComponents="1" type="Float64" format="ascii">
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Float64" format="ascii">
</DataArray>
</Polys>
</Piece>
</PolyData>
</VTKFile>
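For analytics, the ASCII form of a .vtp file can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` to pull the point coordinates and point data out of a PolyData piece. The function name `read_point_arrays` and the two-point `TINY_VTP` document are hypothetical stand-ins that follow the same layout as the example above; only `format="ascii"` arrays are handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-point file with the same structure as the example above.
TINY_VTP = """<VTKFile type="PolyData" version="1.0" byte_order="LittleEndian">
<PolyData>
<Piece NumberOfPoints="2" NumberOfVerts="2" NumberOfLines="0" NumberOfStrips="0" NumberOfPolys="0">
<PointData Scalars="1_temp">
<DataArray Name="1_temp" NumberOfComponents="1" type="Float64" format="ascii">
0.5 0.25
</DataArray>
</PointData>
<Points>
<DataArray Name="Points" NumberOfComponents="3" type="Float64" format="ascii">
0.0 0.0 0.0 1.0 0.0 0.0
</DataArray>
</Points>
</Piece>
</PolyData>
</VTKFile>"""


def read_point_arrays(xml_text):
    """Return {array name: list of per-point tuples} for Points and PointData."""
    root = ET.fromstring(xml_text)
    piece = root.find("./PolyData/Piece")
    n_points = int(piece.get("NumberOfPoints"))
    arrays = {}
    for section in ("Points", "PointData"):
        parent = piece.find(section)
        if parent is None:
            continue
        for node in parent.iter("DataArray"):
            if node.get("format") != "ascii":
                continue  # binary payloads need base64 decoding instead
            n_comp = int(node.get("NumberOfComponents", "1"))
            values = [float(v) for v in (node.text or "").split()]
            rows = [tuple(values[i:i + n_comp])
                    for i in range(0, len(values), n_comp)]
            if len(rows) != n_points:
                raise ValueError(f"array {node.get('Name')!r} has {len(rows)} rows")
            arrays[node.get("Name")] = rows
    return arrays


arrays = read_point_arrays(TINY_VTP)
print(sorted(arrays))
```

The metadata needed to interpret each array (name, component count, type) travels with the data as XML attributes, but the values themselves still have to be split and converted from text.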
<?xml version="1.0"?>
<VTKFile type="PolyData" version="1.0" byte_order="LittleEndian" header_type="UInt64">
<PolyData>
<Piece NumberOfPoints="20" NumberOfVerts="20" NumberOfLines="0" NumberOfStrips="0" NumberOfPolys="0">
<PointData Scalars="1_temp" Vectors="3_force">
<DataArray Name="1_temp" NumberOfComponents="1" type="Float64" format="binary">
oAAAAAAAAABgvDpseejpPw6Ed0W3gNM/kuwL31fN5D8Cfx1wrQ/rP2zR+f3q/9g/EMEd+ts/qT+oNtlobme4P/r4fBrGK9o/vym3nbwm7z/pxw36IAHpPwi+7+7NQrY/XxLNdTO85z+FYpT5VQTiP4DgSYzY5us/lGHPhWe+6T8g20I4bRHYP8ANUTGTt8A/EoCqi4md0j/yfcOW5m7WP5bfiDI2kdI/
</DataArray>
<DataArray Name="2_pressure" NumberOfComponents="1" type="Float64" format="binary">
oAAAAAAAAADdZNbbgA7gP5ymrMbzOdg/oNUpDBsClT9G1D1TmLTvP7ysfVXNyMM/c0WAwoca7D8N/7iEKkPjP8EgdyZmTek/zpYDvu0j0D/tv+JkeBfvPziklr0hN7E/uK9EEl7p3T/gfLGtIEGhP9WISu7DHOE/mJ/qc2jNvT9kNELrlI/uP7QT30pcjOI/1u6XiHHX1T/HmtxzNUbrP0gXKk4BQMk/
</DataArray>
<DataArray Name="3_force" NumberOfComponents="3" type="Float64" format="binary">
4AEAAAAAAAAgpVswV62fPwpRIiBnMdo/QHzbaKFS7z+4L+cOJgfbP9BgBuQUDbc/aLn1Abbgxj9aHwIu+PvSP6kVQrIoQeY/8Eft6Toh0j8oII99u8/UPxBX2HJnmug/xuV6EkiB3j/joAjkKrntP57cNYM1YtU/ZyAShFwy7j/yZzS1kiDsP6BQsPhhCK8/ZGlGFaid7T+kBcyKhbjnP7c7H4Sb8+k/XHAiYITo2z+l6LpFpSXnP0hEfFXpUuY/fIhtqRLr0j+Aqcj9PXa5P5T3xuTmAcg/TyJEsNz66j/nKPa4KuLjP0GbSwv2tOM/+e+13pwZ7j/mD/ZtWU7sP6KxgLA47O0/KPhQlG8C0z9OKmRrQuHrP23jqJ19HeY/5AL3aaJl1D9AVCcmih2KPy9zRLhcBOw/SNh8e6QL4j8ormREFy7JP1axbPWvKe8/+1wXu/cE5T8AEkK26M6DP/4kEby2INs/YHcXL9c82T9Z7O9BX0ThPzq22WP1z9Y/MI1egbDTzj+H2SzfDE3oPy60mnxLbOg/zD87d1572j8uKDN4F0HcP1/XpUlFxOA/cMef/SEP4T9N0zL5vEDtPxS3bBeSL+Q/UwUHsrqL4D+00fCK8+XAP/jSLaEZ+co/ZDsEsPsE4D8=
</DataArray>
</PointData>
<CellData>
</CellData>
<Points>
<DataArray Name="Points" NumberOfComponents="3" type="Float64" format="binary">
4AEAAAAAAACI/0qfWvPdP7DHS/9sm6s/1RCRUhkd7D88ThUFhVHlP/nLGxz0wOU/dffEocmS4D+SfxCuNTbnP3eMkwfTmeE/Nxz3Am7E7j980oYVrDjdP21fp2p+HOE/CJ/BDUNS1D8cSTIrID3FP9htFlsnpbM/MFbuZLUw6T+KLilgPePmPyFgrLYAeuU/EEG1l3gXxz9AlSr8R1mDPyCp7z6y2ss/RyKQabq34z+xb42ZJ6flP/RYNwFKbN4/eAYaYFkntD/gk93QHxjLP/E1QzRL4e0/7fg90Kz27D98bMGz9n/iP9BOqMn0I7o/AAUZpnK9sz/or+rorxXXPyAYqBxzKso/S+DCSMmu7j8zqM+t8zHhPwDzvCx9FG8/NvpGSTRp4T9At9qWJv2fP3AA7b/vGtU/K9MVWb+h6z9A6Vf/pPG3P4tjUid+ku0/5Dkol07Lzj8J2PH49xfiPzZPWzVwi+k/kMyTcT372T+w2G2bIJPfP4CSdS3nxJ0/8E+mw7S40j9A4s9httHbPz5RjS/kOtE/IqbYDERk5z8m9f+Ps5zUPzeBp3RCBu8/0zc54qIK7z+ASXky2YB5P1ry7JerxtM/pP0hKDqe6j/ITmJ0/t6yP+c0Vkhftek/7C04pdwI3z8=
</DataArray>
</Points>
<Verts>
<DataArray Name="connectivity" NumberOfComponents="1" type="Int32" format="binary">
UAAAAAAAAAAAAAAAAQAAAAIAAAADAAAABAAAAAUAAAAGAAAABwAAAAgAAAAJAAAACgAAAAsAAAAMAAAADQAAAA4AAAAPAAAAEAAAABEAAAASAAAAEwAAAA==
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Int32" format="binary">
UAAAAAAAAAABAAAAAgAAAAMAAAAEAAAABQAAAAYAAAAHAAAACAAAAAkAAAAKAAAACwAAAAwAAAANAAAADgAAAA8AAAAQAAAAEQAAABIAAAATAAAAFAAAAA==
</DataArray>
</Verts>
<Lines>
<DataArray Name="connectivity" NumberOfComponents="1" type="Float64" format="binary">
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Float64" format="binary">
</DataArray>
</Lines>
<Strips>
<DataArray Name="connectivity" NumberOfComponents="1" type="Float64" format="binary">
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Float64" format="binary">
</DataArray>
</Strips>
<Polys>
<DataArray Name="connectivity" NumberOfComponents="1" type="Float64" format="binary">
</DataArray>
<DataArray Name="offsets" NumberOfComponents="1" type="Float64" format="binary">
</DataArray>
</Polys>
</Piece>
</PolyData>
</VTKFile>
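The binary arrays above are not directly readable, but they can be decoded in a few lines. The sketch below assumes what the file header suggests: uncompressed data (no compressor attribute), little-endian byte order, and a UInt64 length header that is base64-encoded together with the values. The function name `decode_inline_binary` is hypothetical. The payload is the Verts connectivity array from the example above, split into pieces only for readability.

```python
import base64
import struct

# Verts "connectivity" payload from the binary example above.
CONNECTIVITY = (
    "UAAAAAAAAAAAAAAA"
    "AQAAAAIAAAADAAAA"
    "BAAAAAUAAAAGAAAA"
    "BwAAAAgAAAAJAAAA"
    "CgAAAAsAAAAMAAAA"
    "DQAAAA4AAAAPAAAA"
    "EAAAABEAAAASAAAA"
    "EwAAAA=="
)


def decode_inline_binary(text, item_format, header_format="Q", byte_order="<"):
    """Decode an uncompressed inline base64 DataArray.

    item_format and header_format are struct codes, for example "d" for
    Float64, "i" for Int32 and "Q" for a UInt64 header.
    """
    raw = base64.b64decode("".join(text.split()))
    header_size = struct.calcsize(byte_order + header_format)
    # The header stores the number of payload bytes that follow it.
    (n_bytes,) = struct.unpack_from(byte_order + header_format, raw, 0)
    payload = raw[header_size:header_size + n_bytes]
    if len(payload) != n_bytes:
        raise ValueError("payload is shorter than the header claims")
    count = n_bytes // struct.calcsize(byte_order + item_format)
    return list(struct.unpack(f"{byte_order}{count}{item_format}", payload))


connectivity = decode_inline_binary(CONNECTIVITY, "i")
print(connectivity)
```

The same call with `item_format="d"` would apply to the Float64 arrays such as `1_temp`, whose header decodes to 160 bytes, that is 20 values of 8 bytes each.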
VELaSSCo was an EU-funded project (2014–2016) dealing with end-user visualization of “Big Data” in the petabyte era
Aimed to provide new visualization methods for large-scale simulations across disciplines (FEM, CFD, DEM, etc.)
Developed the VELaSSCo platform for accessing, visualizing, and querying distributed simulation information stored across multiple servers
Data handling
Visualisation
Solver agnostic analytics and visualisation platform
Requires all data to be provided in a common format
Sent (streamed) to visualisation client (GiD, iFX) when required
Adopts the ISO 10303-209 standard (AP209, Multidisciplinary analysis and design)
A machine-readable ASCII format that stores particle and contact data in separate files
Particle Data (.p4p)
TIMESTEP PARTICLES
0.02 11
ID GROUP TYPE VOLUME MASS PX PY PZ VX VY VZ AngVel_X AngVel_Y AngVel_Z
1 1 1 4.18879e-6 0.010472 0.015492 0.016146 0.0008229 0 0 0.19618 0 0 0
5 2 1 4.18879e-6 0.010472 0.016643 0.019136 0.0092912 0 0 0.19618 0 0 0
.......
Note: Angular velocity is shown as an optional value here
Particle-Particle Contact Data (.p4c)
TIMESTEP CONTACTS
0.02 6
P1 P2 CX CY CZ FX FY FZ
11 1 0.004 -0.0055 0.0005 0.727312 -0.098406 2.70531
10 7 0.009 -0.0055 0.0005 -0.00396415 0.235619 0.199911
.......
Note: FX, FY, FZ are total forces
Particle-Geometry Contact Data (.p4w)
TIMESTEP CONTACTS
0.02 4
P1 WALL CX CY CZ FX FY FZ
10 1 -0.198716 -0.0265078 0.087761 -0 -0 0.00993776
11 1 -0.0178762 0.245043 3.74038 -0 -0 0.00993035
.......
Note: FX, FY, FZ are total forces
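The three file types above share one layout: a keyword line, a line with the time and record count, a column header, then one record per line. A single reader can therefore serve all of them. The sketch below assumes exactly that layout with one timestep block per input. The function name `parse_p4_block` and the choice of which columns are integers are assumptions. The sample reuses the two contact rows shown above, with the count set to 2 to match.

```python
import math

# Columns treated as integers; an assumption based on the samples above.
INTEGER_COLUMNS = {"ID", "GROUP", "TYPE", "P1", "P2", "WALL"}

# The two .p4c rows shown above, with the record count adjusted to 2.
SAMPLE_P4C = """TIMESTEP CONTACTS
0.02 2
P1 P2 CX CY CZ FX FY FZ
11 1 0.004 -0.0055 0.0005 0.727312 -0.098406 2.70531
10 7 0.009 -0.0055 0.0005 -0.00396415 0.235619 0.199911
"""


def parse_p4_block(text):
    """Parse one timestep block of a .p4p, .p4c or .p4w file."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    kind = lines[0][1]  # PARTICLES or CONTACTS
    time, count = float(lines[1][0]), int(lines[1][1])
    columns = lines[2]
    records = []
    for fields in lines[3:3 + count]:
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got {len(fields)}")
        records.append({
            name: int(value) if name in INTEGER_COLUMNS else float(value)
            for name, value in zip(columns, fields)
        })
    if len(records) != count:
        raise ValueError(f"header promises {count} records, found {len(records)}")
    return {"kind": kind, "time": time, "columns": columns, "records": records}


block = parse_p4_block(SAMPLE_P4C)
for contact in block["records"]:
    magnitude = math.sqrt(contact["FX"] ** 2 + contact["FY"] ** 2 + contact["FZ"] ** 2)
    print(contact["P1"], contact["P2"], magnitude)
```

Because the column names are stored in the file, optional columns such as angular velocity need no special handling: they simply appear as extra keys in each record.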
Data & Metadata
Hierarchical Data Format version 5 (HDF5) is an open-source file format that supports large, complex, heterogeneous data
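As an illustration of keeping data and metadata together, the sketch below writes a small particle snapshot to HDF5 and reads it back using the third-party `h5py` and `numpy` packages. The group and dataset names (`timestep_000`, `position`, `id`), the attribute names, and the unit `"m"` are all assumptions made for the example, not a layout prescribed by this talk. The two positions are taken from the .p4p sample rows.

```python
import os
import tempfile

import h5py
import numpy as np


def write_snapshot(path, time, ids, positions):
    """Store one timestep with its metadata attached as HDF5 attributes."""
    with h5py.File(path, "w") as handle:
        handle.attrs["description"] = "Hypothetical DEM particle snapshot"
        step = handle.create_group("timestep_000")
        step.attrs["time"] = time
        step.create_dataset("id", data=np.asarray(ids, dtype="i8"))
        pos = step.create_dataset("position", data=np.asarray(positions, dtype="f8"))
        pos.attrs["units"] = "m"  # assumed unit, for illustration only


def read_snapshot(path):
    """Return (time, positions array, position units) from the file."""
    with h5py.File(path, "r") as handle:
        step = handle["timestep_000"]
        positions = step["position"][...]  # copy into memory before closing
        return float(step.attrs["time"]), positions, step["position"].attrs["units"]


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "particles.h5")
    write_snapshot(
        path,
        time=0.02,
        ids=[1, 5],
        positions=[(0.015492, 0.016146, 0.0008229),
                   (0.016643, 0.019136, 0.0092912)],
    )
    time, positions, units = read_snapshot(path)

print(time, positions.shape, units)
```

Unlike the text formats, the arrays keep their binary types, and descriptive metadata sits next to the values it describes.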
Any Questions?
Email: J.Morrissey@ed.ac.uk
TUSAIL Community on Zenodo