28  Optional Self-Study Session

Real-world applications on industrial data face many challenges. One of them is that machine data is usually retrieved from programmable logic controllers (PLCs) via communication protocols. These protocols often lack standardization and can vary significantly between manufacturers and even between PLC models from the same manufacturer. This heterogeneity makes it difficult to integrate data from multiple sources and can add considerable complexity to data processing and analysis.
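To make the integration problem concrete, the following minimal sketch maps two differently shaped payloads onto one common schema. Both payload formats, the field names (`ts`, `tag`, `val`), and the `Reading` class are hypothetical and chosen only for illustration; they do not correspond to any specific PLC protocol or vendor.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """Common schema that every source is mapped onto (illustrative)."""
    timestamp: datetime  # always timezone-aware UTC
    signal: str          # harmonized, lower-case signal name
    value: float         # always expressed in `unit`
    unit: str


def from_vendor_a(payload: dict) -> Reading:
    # Hypothetical format A: epoch milliseconds, temperature in tenths of a degree Celsius.
    return Reading(
        timestamp=datetime.fromtimestamp(payload["ts"] / 1000, tz=timezone.utc),
        signal=payload["tag"].lower(),
        value=payload["val"] / 10,
        unit="degC",
    )


def from_vendor_b(line: str) -> Reading:
    # Hypothetical format B: semicolon-separated text, ISO 8601 timestamp, value already in degC.
    ts, tag, val = line.split(";")
    return Reading(
        timestamp=datetime.fromisoformat(ts).astimezone(timezone.utc),
        signal=tag.lower(),
        value=float(val),
        unit="degC",
    )


# The same physical measurement, delivered in two different shapes.
readings = [
    from_vendor_a({"ts": 1757419200000, "tag": "TEMP_1", "val": 215}),
    from_vendor_b("2025-09-09T12:00:00+00:00;TEMP_1;21.5"),
]
```

Every additional source needs its own adapter for timestamps, naming, and units, which is one reason the integration effort grows with each new manufacturer or PLC model.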

In the following, you will read a review paper discussing challenges in introducing industrial data science to existing (brownfield) environments.

Warning: Predatory Journals

Predatory journals exploit the academic publishing model for profit, often lacking rigorous peer review and editorial oversight. They may present themselves as legitimate outlets but prioritize financial gain over scholarly integrity. Researchers should be cautious when selecting journals for publication, ensuring they choose reputable venues that uphold high academic standards.

Wikipedia (accessed 9 September 2025) states: MDPI’s business model is based on establishing entirely open access broad-discipline journals, with fast processing times from submission to publication and article processing charges paid by the author, their institutions or funders. MDPI’s business practices have attracted controversy, with critics suggesting it sacrifices editorial and academic rigor in favor of operational speed and business interests. MDPI was included on Jeffrey Beall’s list of predatory open access publishing companies in 2014; it was removed in 2015 following a successful appeal, while applying pressure on Beall’s employer. Some journals published by MDPI have also been noted by the Chinese Academy of Sciences (CAS) and Norwegian Scientific Index for lack of rigor and possible predatory practices; as of 2025, CAS no longer lists any MDPI journals on its Early Warning List. In 2024, Finland’s Publication Forum, which classifies publication channels for academic research, downgraded 193 MDPI journals to its lowest, level 0 rating.

PredatoryJournals.org published a blog post discussing MDPI in this regard.

Exercise 28.1 (Data Science on Industrial Data – Today’s Challenges in Brown Field Applications)  

Read Klaeger, Gottschall, and Oehm (2021), published by MDPI. You should be able to discuss the following questions:

  • What is the typical data flow from a machine to a data store?
  • Why is data acquisition harder in brownfield environments than in labs or new machines?
  • Why is ground-truth data difficult to obtain in industrial settings, and what are the possible implications?
  • How does the long lifecycle of industrial machines complicate the adoption of AI technologies?
  • Why does data preparation in industry take even more effort than the often-cited rule that data preparation consumes 80% of the time?