TikTok owner introduces LEGO language model, OpenAI not happy


TikTok’s parent company, ByteDance, is planning to launch its own LLM. Last month, OpenAI suspended ByteDance’s account for using GPT to build a competitor.

Chinese tech giant ByteDance released a research paper on January 15th outlining its large language model (LLM), named LEGO. According to the scientists behind the research, the model can grasp fine details across multiple modalities: text, video, audio, and images.

The researchers say they’ve constructed a “diverse and high-quality multimodal training dataset” that includes spatial and temporal information. According to them, the model can precisely identify and localize specific regions in images and specific moments in videos.

ByteDance LLM
LEGO's design includes different encoders and adapters for various types of information like video, image, and audio. Each type of information goes through its own encoder and adapter, producing specific embeddings for each. The figure demonstrates two examples using video and image modalities. Blue boxes represent video as input, while yellow boxes represent image as input. Source: Research paper
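
To make the encoder-and-adapter layout described in the caption more concrete, here is a minimal Python sketch. It is not ByteDance's code: the names `ModalityBranch`, `toy_encoder`, `toy_adapter`, `embed`, and the dimension `SHARED_DIM` are hypothetical, and the "encoders" and "adapters" are simple stand-in functions rather than trained networks. The only idea it illustrates is that each modality passes through its own encoder and its own adapter, and that every branch ends in an embedding of the same size.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = List[float]
SHARED_DIM = 4  # hypothetical size of the shared embedding space


@dataclass
class ModalityBranch:
    encoder: Callable[[Sequence[float]], Vector]  # raw input -> modality features
    adapter: Callable[[Vector], Vector]           # features -> shared-size embedding


def fold(values: Sequence[float], size: int) -> Vector:
    """Sum values into `size` buckets (a stand-in for a learned layer)."""
    out = [0.0] * size
    for i, value in enumerate(values):
        out[i % size] += value
    return out


def toy_encoder(feature_dim: int) -> Callable[[Sequence[float]], Vector]:
    # Each modality gets its own feature size, as a real encoder would.
    return lambda raw: fold(raw, feature_dim)


def toy_adapter(shared_dim: int = SHARED_DIM) -> Callable[[Vector], Vector]:
    # Every adapter maps its modality's features to the same output size.
    return lambda features: fold(features, shared_dim)


# One encoder/adapter pair per modality (feature sizes are arbitrary).
BRANCHES: Dict[str, ModalityBranch] = {
    "video": ModalityBranch(toy_encoder(8), toy_adapter()),
    "image": ModalityBranch(toy_encoder(6), toy_adapter()),
    "audio": ModalityBranch(toy_encoder(5), toy_adapter()),
}


def embed(modality: str, raw: Sequence[float]) -> Vector:
    """Route the input through the branch that matches its modality."""
    branch = BRANCHES[modality]
    return branch.adapter(branch.encoder(raw))


if __name__ == "__main__":
    print(len(embed("video", [1.0] * 10)))  # length equals SHARED_DIM
    print(len(embed("image", [0.5, 0.25])))
```

In this sketch, an unknown modality name raises a `KeyError`, since no branch exists for it; a real system would presumably handle that case explicitly.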

ByteDance has been at the center of controversy over its ties with the Chinese state. A report submitted to the Australian Senate in August 2023 by four researchers said that ByteDance should be called a “hybrid” state-private entity because it is “intertwined” with the government in Beijing.

The company has faced bans on its social media app TikTok across multiple countries due to data security concerns.

Using OpenAI’s GPT to build its own model

ByteDance faced backlash in December 2023 after it was accused of secretly using OpenAI’s technology to build its own models. According to internal ByteDance documents cited by The Verge, the Chinese company relied on OpenAI’s API “during nearly every phase of development” of its foundational LLM.

OpenAI suspended ByteDance’s account over concerns about how the company was using GPT’s output. OpenAI explicitly forbids users from using output generated by ChatGPT to create competing AI models.

ByteDance denied any wrongdoing. As reported by CNN, a company spokesperson claimed that its engineering team uses OpenAI’s GPT, along with other third-party models, only to a very limited extent during evaluation and testing.

According to the spokesperson, the company uses GPT’s API to power products and features in markets outside China. The company uses its own self-developed AI model to power a ChatGPT-like tool called Doubao, which is available only in China.
