Data Integration and Management Component (DIM)

Goal

The DIM component is one of the core elements of the S2CP platform. It provides the platform's data-related functionality through a collection of microservices exposed via APIs. S2CP includes the following services as part of DIM, designed for use by different CRFS actors:

  1. Integration and Analysis Service:
    The integration and analysis service manages data acquisition from various sources and converts the data into the CITIES2030 data model format. Data intended for analysis often follows different data models, so the integration service creates a centralized database view that consolidates data from all sources in the cloud-hosted data warehouse.

The collated data passes through two modular components to achieve the underlying objectives and provide insights and business intelligence to different actors:

    • Data Processing: Data processing handles the transformations required on the data before it enters the pipeline for analytical models. This process involves methods such as data clustering, featurization, and feature engineering.
    • Data Analysis: The data analytics pipeline applies techniques like federated and ensemble learning to address the heterogeneity of data and meet the diverse objectives of the involved actors.
  2. Data Governance Service:
    The data governance service ensures the security and privacy of the platform’s data. This service includes the following features:
    • Data Discovery: This component lets other components discover the data formats and models offered by the different data sources.
    • Data Value Exchange Negotiation: This component facilitates data exchange and interoperability by ensuring that data is exchanged in the agreed format.
    • Multi-Party Incentivized Data Sharing: This component incentivizes data exchange among different actors to achieve common goals, showcasing the positive impacts of data sharing through insights generated by analytical models.
    • Integration with Smart Contracts: This component connects with the underlying blockchain platform to provide smart contract functionality for data exchange.

Figure 1 illustrates these components.

Figure 1 - Modular representation of Data Integration and Management Component in S2CP Platform
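
The integration step described for the Integration and Analysis Service can be illustrated with a minimal sketch. It assumes two hypothetical source feeds ("retailer" and "farm") with made-up field names, and a hypothetical CommonRecord class standing in for the CITIES2030 data model, whose real schema is not given here. Each source gets a small adapter, and the merged result plays the role of the centralized view.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical stand-in for one row of the common data model."""
    source: str
    product: str
    quantity_kg: float
    observed_at: datetime


def from_retailer(row: dict) -> CommonRecord:
    # Assumed retailer feed: weight in grams, ISO 8601 timestamp string.
    ts = datetime.fromisoformat(row["ts"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return CommonRecord(
        source="retailer",
        product=row["item"].strip().lower(),
        quantity_kg=row["weight_g"] / 1000.0,
        observed_at=ts,
    )


def from_farm(row: dict) -> CommonRecord:
    # Assumed farm feed: weight in kilograms, Unix epoch seconds.
    return CommonRecord(
        source="farm",
        product=row["crop"].strip().lower(),
        quantity_kg=float(row["kg"]),
        observed_at=datetime.fromtimestamp(row["epoch"], tz=timezone.utc),
    )


# One adapter per source; adding a source means adding one function here.
ADAPTERS = {"retailer": from_retailer, "farm": from_farm}


def integrate(batches: dict) -> list:
    """Convert every source batch to CommonRecord and merge into one view."""
    merged = []
    for source, rows in batches.items():
        adapter = ADAPTERS[source]
        merged.extend(adapter(row) for row in rows)
    return sorted(merged, key=lambda record: record.observed_at)
```

Keeping one adapter per source means the downstream processing and analysis steps only ever see a single record shape, regardless of how many source data models exist.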

Tools and Technologies

    • Programming: Python (scripting and development), scikit-learn, NumPy, SciPy, Matplotlib, pandas
    • Databases and APIs: MySQL, CouchDB, Flask, JSON, WSGI (e.g., Gunicorn)
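
To show how a DIM microservice might expose data as JSON over WSGI, here is a minimal sketch. It deliberately uses a plain WSGI callable from the Python standard library rather than Flask, so it runs without third-party packages. The /inventory/<product> route and the in-memory INVENTORY dictionary are hypothetical placeholders for a real endpoint and a MySQL or CouchDB query.

```python
import json

# Hypothetical in-memory stand-in for rows read from MySQL or CouchDB.
INVENTORY = {
    "apples": {"quantity_kg": 12.5},
    "pears": {"quantity_kg": 0.0},
}


def app(environ, start_response):
    """Minimal WSGI callable: GET /inventory/<product> returns JSON."""
    method = environ.get("REQUEST_METHOD", "GET")
    parts = [p for p in environ.get("PATH_INFO", "").split("/") if p]

    item = None
    if method == "GET" and len(parts) == 2 and parts[0] == "inventory":
        item = INVENTORY.get(parts[1])

    if item is not None:
        status = "200 OK"
        payload = {"product": parts[1], **item}
    else:
        status = "404 Not Found"
        payload = {"error": "not found"}

    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If this were saved as a module named dim_api.py (a hypothetical name), a WSGI server such as Gunicorn could serve it with `gunicorn dim_api:app`. A Flask application object is itself a WSGI callable and is served the same way.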

Use-case example: Reducing Food Wastage – Sustainable and Efficient Food Supply Chain

A small AI-enabled camera performs real-time calculations at the edge. An AI model is trained to estimate the volume of produce in a display bin, so the system knows in real time how much produce each bin holds. When a shopper takes produce, the estimate updates automatically to reflect the reduced quantity in the bin. Similarly, if the produce is restocked or the shopper returns the selected item, the estimate updates automatically.

This system provides two primary outputs:

  1. It alerts store personnel when produce in a display bin runs low or runs out, enabling them to restock promptly.
  2. By capturing time-series snapshots of the produce, it gives store personnel analytical insights into metrics such as hours to sell, days to sell, replenishment rates, restocking rates, and purchasing rates. These insights help retailers optimize their supply chain.
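
Both outputs can be derived from the same stream of snapshots. The sketch below assumes each snapshot is a (timestamp, estimated kilograms) pair sorted by time, and that any increase between snapshots counts as a restock (a returned item would be counted the same way). The function names, the 1.0 kg alert threshold, and the metric names are illustrative assumptions, not part of the platform.

```python
from datetime import datetime, timedelta


def depletion_alerts(snapshots, threshold_kg=1.0):
    """Return the timestamps at which the bin first drops to the threshold."""
    alerts = []
    below = False
    for ts, kg in snapshots:
        is_low = kg <= threshold_kg
        if is_low and not below:
            alerts.append(ts)  # alert once per depletion, not per snapshot
        below = is_low
    return alerts


def sales_metrics(snapshots):
    """Derive simple sell and restock metrics from sorted snapshots."""
    sold = 0.0
    restocked = 0.0
    for (_, prev_kg), (_, cur_kg) in zip(snapshots, snapshots[1:]):
        delta = cur_kg - prev_kg
        if delta < 0:
            sold += -delta        # produce taken by shoppers
        else:
            restocked += delta    # assumption: increases are restocks
    hours = (snapshots[-1][0] - snapshots[0][0]).total_seconds() / 3600
    return {
        "sold_kg": sold,
        "restocked_kg": restocked,
        "purchase_rate_kg_per_hour": sold / hours if hours else 0.0,
    }


if __name__ == "__main__":
    start = datetime(2024, 5, 1, 8, 0)
    day = [
        (start, 10.0),
        (start + timedelta(hours=2), 6.0),
        (start + timedelta(hours=4), 0.5),   # nearly empty: triggers an alert
        (start + timedelta(hours=5), 10.0),  # restocked
        (start + timedelta(hours=8), 7.0),
    ]
    print(depletion_alerts(day))
    print(sales_metrics(day))
```

Metrics such as hours to sell or days to sell follow the same pattern: measure the time between a restock and the next depletion alert.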

When such systems operate in parallel at different geographical locations within a city (or extend to national and international levels), they exchange model parameter updates through federated learning rather than sharing raw data. This process further enhances the efficiency of the supply chain at both regional and global levels, serving as part of a decision support system.
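
The parameter exchange can be sketched with a sample-weighted average of locally trained model weights, in the style of federated averaging. This is one common aggregation rule and is shown only as an assumption; the text does not state which rule the platform uses. The function name and the flat list-of-floats weight format are hypothetical.

```python
def federated_average(updates):
    """Combine local model weights into one global model.

    updates: list of (weights, n_samples) pairs, one per site, where
    weights is a list of floats of equal length across all sites.
    Sites that trained on more samples get proportionally more influence.
    """
    if not updates:
        raise ValueError("at least one site update is required")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    if any(len(weights) != dim for weights, _ in updates):
        raise ValueError("all sites must send weights of the same length")
    return [
        sum(weights[i] * n for weights, n in updates) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical stores send their weights and local sample counts.
    store_a = ([1.0, 2.0], 100)
    store_b = ([3.0, 6.0], 300)
    print(federated_average([store_a, store_b]))
```

Only the weights and sample counts leave each site, so the raw camera data stays local while every location benefits from the shared global model.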