Raw data is data collected from a data source, such as an operating system, an application process, a weather station, or a sensor, without changes or modification of any kind, numerical or otherwise. This set of data is sometimes called primary data.
Raw data can have any format: binary or text, record-oriented or not, time series or not. It is usually found in a simple text-oriented format such as CSV, where data is presented as records and each record consists of comma-separated fields.
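To make the record-and-field structure concrete, here is a minimal Python sketch that parses a few CSV records with the standard `csv` module. The sample values and the field names (`timestamp`, `temperature_c`, `humidity_pct`) are invented for illustration and do not come from any particular sensor or product.

```python
import csv
import io

# Hypothetical raw data: three records from an imaginary weather sensor.
# The field layout below is an assumption made for this example.
RAW = """1700000000,21.5,40
1700000060,21.7,41
1700000120,21.6,40
"""

FIELDS = ["timestamp", "temperature_c", "humidity_pct"]


def read_records(text):
    """Parse comma-separated raw text into a list of dicts, one per record."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    return list(reader)


records = read_records(RAW)
for rec in records:
    # Values are kept as strings, exactly as they appear in the raw data.
    print(rec["timestamp"], rec["temperature_c"], rec["humidity_pct"])
```

The values are deliberately left as strings: converting them to numbers is already a processing step, and the raw records themselves stay untouched.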
Why do I need raw data?
It is important to collect raw data from your data sources and store it somewhere safe and easy to access. This will be your centralised data point, containing all data recorded from your network of sensors (IIoT), from a data centre, or from a suite of applications or services. From this centralised point you can easily inspect, browse and conduct any type of data analysis or visualization you like without being restricted to a particular software application. In Kronometrix we use our data analytics module as our centralised data point.
You will need access to raw data, because:
- it is the primary data set, before any changes are applied by humans or machines
- charts, summaries and aggregations can be misleading, and as data becomes more complex it is important to be able to access and process the underlying records
- it lets you conduct any type of analysis or visualization you like
- if security and privacy matter, it helps ensure nobody has changed the data
- any forecast requires a solid base of raw data
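One possible way to check that raw data has not been changed is to record a cryptographic digest when the data is collected and compare it later. The sketch below uses Python's standard `hashlib` and `hmac` modules. The sample bytes and the function names are hypothetical, and the check is only meaningful if the recorded digest itself is kept somewhere it cannot be altered.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, expected_digest: str) -> bool:
    """Compare the data's current digest with the one recorded earlier."""
    return hmac.compare_digest(sha256_hex(data), expected_digest)


# Hypothetical raw data as it was collected, and the digest recorded then.
original = b"1700000000,21.5,40\n1700000060,21.7,41\n"
recorded_digest = sha256_hex(original)

# Later: the same bytes pass the check, a modified copy does not.
tampered = original.replace(b"21.5", b"25.1")
print(is_unchanged(original, recorded_digest))
print(is_unchanged(tampered, recorded_digest))
```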
Who else is collecting and using raw data?
Any statistical, numerical or visualization process requires access to raw data. From financial markets, with their raw data feeds, to medicine, space, aeronautics and biochemical engineering, all of these fields rely heavily on raw data sets.
You should ask your vendors and software providers for access to raw data, or use software that readily provides it. Ultimately, ensure all data recorded in and out of your company is stored somewhere and available in raw format. This will bring order, save money and add stability to your organization in the long run.