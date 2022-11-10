Meta introduces ‘Tulip,’ a binary serialization protocol that supports schema evolution. Tulip assists with data schematization by addressing protocol reliability and related issues at the same time. It replaces multiple legacy formats used in Meta’s data platform and has delivered considerable gains in performance and efficiency.

Meta’s data platform is made up of numerous heterogeneous services, such as warehouse data storage and various real-time systems, that exchange large amounts of data and communicate with one another through service APIs. As the number of AI and machine learning (ML) workloads that use this data to train models grows, the data logging systems must continually become more efficient.

Schematization of data plays a huge role in building a data platform at Meta’s scale. These systems are designed with the understanding that every decision and trade-off affects reliability, data preprocessing efficiency, performance, and the developer experience of engineers. Changing serialization formats for the data infrastructure is a big bet, but it offers long-term benefits that let the platform evolve over time.
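To make "schema evolution" concrete, here is a minimal sketch of the general idea in Python. It is not Tulip's actual wire format, which this article does not describe. The field IDs, the schemas, the default values, and the `encode`/`decode` helpers are all hypothetical, chosen only to show how a reader with an older or newer schema can still decode a record.

```python
import struct

# Hypothetical schemas for illustration only; this is NOT Tulip's wire format.
# Each schema maps a stable numeric field ID to a field name.
SCHEMA_V1 = {1: "user_id"}
SCHEMA_V2 = {1: "user_id", 2: "country"}  # V2 adds a new field

# Assumed defaults used when a field is absent from the payload.
DEFAULTS = {"user_id": "", "country": "unknown"}


def encode(record, schema):
    """Encode string fields as (field_id: u16, length: u32, UTF-8 bytes)."""
    out = bytearray()
    for field_id, name in schema.items():
        if name in record:
            payload = record[name].encode("utf-8")
            out += struct.pack(">HI", field_id, len(payload)) + payload
    return bytes(out)


def decode(data, schema):
    """Decode with the reader's schema: skip unknown IDs, default missing fields."""
    record = {name: DEFAULTS[name] for name in schema.values()}
    offset = 0
    while offset < len(data):
        field_id, length = struct.unpack_from(">HI", data, offset)
        offset += 6  # 2-byte field ID plus 4-byte length
        payload = data[offset:offset + length]
        offset += length
        name = schema.get(field_id)
        if name is not None:  # fields the reader does not know are skipped
            record[name] = payload.decode("utf-8")
    return record


# A newer writer and an older reader, then the reverse.
new_bytes = encode({"user_id": "42", "country": "NL"}, SCHEMA_V2)
old_bytes = encode({"user_id": "42"}, SCHEMA_V1)
old_reader_view = decode(new_bytes, SCHEMA_V1)
new_reader_view = decode(old_bytes, SCHEMA_V2)
```

In this sketch, the older reader ignores the `country` field it does not recognize, and the newer reader fills in a default for the field the older writer never sent. Stable field IDs and length prefixes are what make both directions work; a production protocol would also have to handle types, nesting, and validation.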
