Unstructured vs semi-structured data: Order from chaos

September 3, 2022

Structured vs unstructured data – it’s a common way of categorising things.
But it’s not quite that simple.

Structured data is easy to grasp. Less simple is the world of unstructured data and its transformation into semi-structured data that is easier to understand, use and analyse.

In this article, we look at structured data, unstructured data, and how semi-structured data brings some order from potential chaos, with benefits for organisations that want to gain value from often very large stores of documents, images, sound files, video, social media posts, and so on.

Structured data has… structure

Business information is mostly generated by systems or people. Data from systems is most likely to be structured.

In its traditional format, this is typified by data in relational databases that use SQL (Structured Query Language). In these, structure is everything. Columns that represent variables are set up in advance and populated by rows of data, with a value sitting at the intersection of each row and column.

It’s something we can all visualise, much like a spreadsheet, though whether spreadsheets are structured data is up for debate. Complex SQL database schemas involve the equivalent of numerous spreadsheets (tables, in database-speak) that relate (whence “relational”) to each other and can be filtered, joined and manipulated in many ways because they have common elements (keys).
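
As a minimal sketch of that idea, the Python snippet below uses the standard library's sqlite3 module to build two tables that share a key and then joins them. The table names, column names and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema for illustration: columns are declared up front,
# and the two tables relate to each other through customer_id (the key).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total       REAL NOT NULL
    );
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5)])


def totals_by_customer(connection):
    """Join the two tables on their common key and sum each customer's orders."""
    return connection.execute("""
        SELECT c.name, SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY c.name
    """).fetchall()
```

Calling totals_by_customer(conn) returns one row per customer, which is only possible because both tables carry the same key column.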

Despite the prevalence of unstructured data and the rise of formats that are better described as semi-structured, structured databases are important and won’t go away soon.

They are easy to use, by everything from large-scale enterprise applications to machine learning tools, but they can be limited in how they are accessed and used, and can be relatively onerous to maintain and to change once initially configured.

The mass of unstructured data

Unstructured data is often generated by people – although not solely – and includes media such as images and sound recordings, social media posts, agent notes, websites and emails.

Unstructured data holds to no predefined data model, and files and objects come in a wide range of sizes, from a few kilobytes for a social media post to potentially terabytes for uncompressed video footage.

Estimates often suggest that the vast bulk of data is unstructured – up to 80% or 90% of data held by organisations.

If that is the case – and we can safely assume it often is – then this presents huge challenges for organisations. Unstructured data is, to a greater or lesser extent, undefined and opaque to search and classification.

That means organisations may not know what is actually there, and that can be a security and compliance risk. At the same time, it means missing out on opportunities to interrogate that data to gain insights and value from it.

No such thing as unstructured data?

But in fact, it is arguable that no data is truly unstructured. The most unstructured data you can think of – image and sound files, for example – comes with metadata headers that provide high-level information on file contents that can be searched and questioned.
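
To make the point about metadata headers concrete, here is a small sketch using Python's standard wave module. It generates a silent WAV file in memory (a stand-in for a real recording, purely an assumption for the example) and then reads back the high-level information stored in the file header without inspecting the audio itself. The function names are hypothetical.

```python
import io
import wave


def make_silent_wav(seconds=1, framerate=8000):
    """Build a mono, 16-bit silent WAV file in memory (illustrative stand-in)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(framerate)
        wav.writeframes(b"\x00\x00" * framerate * seconds)
    return buffer.getvalue()


def read_wav_metadata(data):
    """Read only the header fields: the searchable part of an 'unstructured' file."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": wav.getframerate(),
            "duration_seconds": wav.getnframes() / wav.getframerate(),
        }
```

The same approach applies, with other libraries, to image and video formats that carry their own header metadata.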

And it is increasingly possible to examine and categorise the contents of such files, including sound and video, using artificial intelligence/machine learning techniques. YouTube does this to ensure copyright on music is not contravened when you upload a video, for instance. So these types of data can be tagged with new metadata through algorithm-based interrogation, should an organisation wish to throw compute at it.

The semi-structured data revolution

At the same time, there is a growing trend towards more use of semi-structured ways of holding data. Some forms of semi-structured data have been around for some time, such as CSV and XML. A bit later came JSON. All these brought with them something like a key:value format for representing variables and values.
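
As a rough illustration of that shared key:value idea, the sketch below parses the same record from CSV, XML and JSON using only Python's standard library. The record and its field names are made up for the example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical record expressed in three semi-structured formats.
CSV_TEXT = "name,city\nAda,London\n"
XML_TEXT = "<person><name>Ada</name><city>London</city></person>"
JSON_TEXT = '{"name": "Ada", "city": "London"}'


def from_csv(text):
    """The header row supplies the keys; the data row supplies the values."""
    return dict(next(csv.DictReader(io.StringIO(text))))


def from_xml(text):
    """Each child element's tag is the key; its text is the value."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


def from_json(text):
    """JSON objects are key:value pairs already."""
    return json.loads(text)
```

All three functions return the same dictionary, which shows that the formats differ in syntax rather than in the information they carry.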

Later came a wide range of ways of holding and analysing data that were not restricted by predefined structure. Broadly speaking, these can be lumped together as so-called NoSQL databases, but there are a number of types within that catch-all.

They include column store databases like HBase (part of the Hadoop ecosystem) and Cassandra, document stores like MongoDB and CouchDB, key value stores like Riak, as well as graph databases, object databases, and so on. The list gets pretty long.

But what links these is the lack of the predefined structure – schema-on-write – by which SQL databases are defined. So, with these non-SQL formats, potentially any data in any existing format, ie unstructured, can be given a structure – schema-on-read – as data is queried. It is even possible to include sound and video files – the ultimate in unstructured data – in things that get called databases, such as with MongoDB (although there are limitations).
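
One way to picture schema-on-read is the sketch below, which stores records as raw JSON lines with no agreed structure and only imposes field names, types and defaults at query time. The sample records, the schema format and the function name are all assumptions made for illustration, not how any particular NoSQL product works.

```python
import json

# Records were written without any agreed schema: fields vary from line
# to line, and "likes" is sometimes a number, sometimes a string, sometimes absent.
RAW_LINES = [
    '{"user": "ada", "text": "Love the new phone", "likes": 12}',
    '{"user": "grace", "text": "Delivery was late"}',
    '{"user": "linus", "text": "Battery is great", "likes": "7", "lang": "en"}',
]

# The schema is chosen by the reader: field name -> (type cast, default value).
LIKES_SCHEMA = {"user": (str, ""), "likes": (int, 0)}


def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields, coerce types, fill defaults."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else default
            for field, (cast, default) in schema.items()
        }
```

A different query could read the same raw lines with a different schema, for example one that keeps "text" and "lang", without any change to the stored data.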

The big advantage of being able to put unstructured data into some form of semi-structured format is that it enables a range of use cases to emerge, such as sentiment analysis and analytics to spot consumer behaviour and market trends.

Arguably, analytics on this kind of data gives deeper insight into users. An SQL database might hold name, date of birth, address, etc, but analysing unstructured data – via making it semi-structured – can get closer to what consumers think.

It is also possible to put some structure on the unstructured and make use of it. A photograph of a delivered item would be unstructured data, but metadata from the image file could be combined with geo-tracking information from delivery vehicles in a business intelligence tool.
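
The delivery example might be sketched as follows, assuming the photo's metadata includes a timestamp and the vehicle reports timestamped GPS fixes. The data, field names and matching rule (nearest fix in time) are hypothetical, and a real business intelligence tool would do this at much larger scale.

```python
from datetime import datetime

# Hypothetical metadata extracted from a delivery photo's file header.
PHOTO_METADATA = {"file": "parcel_0001.jpg", "taken_at": "2022-09-03T10:15:20"}

# Hypothetical geo-tracking fixes reported by the delivery vehicle.
VEHICLE_TRACK = [
    {"at": "2022-09-03T10:10:00", "lat": 51.5007, "lon": -0.1246},
    {"at": "2022-09-03T10:15:00", "lat": 51.5033, "lon": -0.1196},
    {"at": "2022-09-03T10:20:00", "lat": 51.5055, "lon": -0.0754},
]


def locate_photo(photo, track):
    """Attach the GPS fix closest in time to when the photo was taken."""
    taken_at = datetime.fromisoformat(photo["taken_at"])
    nearest = min(
        track,
        key=lambda fix: abs(datetime.fromisoformat(fix["at"]) - taken_at),
    )
    return {**photo, "lat": nearest["lat"], "lon": nearest["lon"]}
```

The result is a single semi-structured record that ties the image to a place, ready to be filtered or aggregated alongside other deliveries.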
