
JODA - JSON On Demand Analysis

Efficient data wrangling for semi-structured JSON documents

What is JODA?

JODA is an efficient data wrangling tool for semi-structured JSON datasets. It handles data at any scale, from small datasets to big data, and uses all available system resources to maximize performance. JODA also creates indices adaptively, based on the workload, which speeds up iterative workloads.
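To give a rough intuition for why workload-driven indexing helps iterative analysis, the following toy sketch builds an index entry the first time a filter is evaluated and reuses it on later queries. It is only an illustration of the general idea, not JODA's implementation or query language: the `AdaptiveStore` class, the `lookup` helper, and the slash-separated path syntax are all hypothetical names introduced here.

```python
import json
from typing import Any, Dict, List, Tuple


def lookup(doc: Any, path: str) -> Any:
    """Hypothetical helper: follow a slash-separated path such as '/user/lang'."""
    current = doc
    for part in path.strip("/").split("/"):
        if not isinstance(current, dict) or part not in current:
            return None  # path does not exist in this document
        current = current[part]
    return current


class AdaptiveStore:
    """Toy in-memory store that indexes a filter the first time it is used."""

    def __init__(self, documents: List[dict]) -> None:
        self.documents = documents
        # Maps (path, serialized value) to positions of matching documents.
        self.index: Dict[Tuple[str, str], List[int]] = {}
        self.scans = 0  # number of full scans, kept only for illustration

    def select(self, path: str, value: Any) -> List[dict]:
        key = (path, json.dumps(value, sort_keys=True))
        if key not in self.index:
            # First time this filter is seen: scan everything once
            # and remember which documents matched.
            self.scans += 1
            self.index[key] = [
                i for i, doc in enumerate(self.documents)
                if lookup(doc, path) == value
            ]
        # Repeated filters are answered from the remembered positions.
        return [self.documents[i] for i in self.index[key]]


docs = [
    {"user": {"lang": "en"}},
    {"user": {"lang": "de"}},
    {"user": {"lang": "en"}},
]
store = AdaptiveStore(docs)
first = store.select("/user/lang", "en")   # triggers a full scan
second = store.select("/user/lang", "en")  # served from the index
```

In this sketch, only filters that the workload actually uses pay the cost of a scan, and repeating a filter during iterative exploration avoids rescanning the dataset.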

If you are just getting started, check out the following resources:

Latest Release - v0.14.0

The latest release can always be found on GitHub.

0.14.0 Query Execution Redesign (09-02-2023)

In this release, the query execution pipeline has been completely redesigned. This allowed us to extend JODA with many more features, but it may affect performance positively or negatively, depending on the query.

The most important changes are:

Breaking Changes

Added

Citation

If you use this project in your research, please cite it using our ICDE 2020 demo paper.

BibTeX:


@inproceedings{DBLP:conf/icde/Schafer020,
  author    = {Nico Sch{\"{a}}fer and
               Sebastian Michel},
  title     = {{JODA:} {A} Vertically Scalable, Lightweight {JSON} Processor for
               Big Data Transformations},
  booktitle = {36th {IEEE} International Conference on Data Engineering, {ICDE} 2020,
               Dallas, TX, USA, April 20-24, 2020},
  pages     = {1726--1729},
  publisher = {{IEEE}},
  year      = {2020},
  url       = {https://doi.org/10.1109/ICDE48307.2020.00155},
  doi       = {10.1109/ICDE48307.2020.00155},
  timestamp = {Fri, 05 Jun 2020 17:54:57 +0200},
  biburl    = {https://dblp.org/rec/conf/icde/Schafer020.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}