GeoGPT is an open-source, non-profit Large Language Model (LLM) developed for the geosciences. It was released globally on 27 April 2025 to widespread acclaim, and already has more than 22,000 registered users. A preliminary analysis of these users showed that many are from Africa and Asia, and that they include geophysicists, remote sensing specialists, disaster risk management scientists, mineral and hydrocarbon explorationists, and data geoscientists.
In 2024, DDE inspired and supported the development of GeoGPT by facilitating the presentation of its early versions at national (Japan Geoscience Union), regional (EGU 2024, GSA 2024, AGU 2024) and global (IGC 37) events to obtain feedback from potential users. During the latter part of 2024, GeoGPT went through several trials with the help of registered users to improve its performance. It reached its current stage when the Executive and Governing Committees for GeoGPT decided to open it to all interested users.
DDE and Zhejiang Lab, the owner of GeoGPT, have recently signed an MoU defining the roles and responsibilities of the two parties. Like the DDE Platform, GeoGPT is committed to promoting the implementation of the UNESCO Open Science Recommendation adopted by 193 Member States in 2021. Access to, and use of, GeoGPT is open to all, free of charge.
Like DDE, GeoGPT is the brainchild of an international group of scientists, engineers and technologists, many of whom have held senior positions and have seen geoscience fall behind more numerate sciences that have gained greatly from artificial intelligence and machine learning. Both GeoGPT and DDE aim to free geoscience, at least to some degree, from ‘the long tail’ of data.
The long tail in geoscience was first discussed in detail by Sinha et al. in a 2013 paper. They indicated that geoscience broadly has two kinds of data: data generated by sensor technologies, and data generated in more variable ways. Sensor technology data (for example, in earthquake studies and volcanology) usually reside in well-designed and curated data centres. By contrast, the many small datasets of the deep-time long tail (e.g. stratigraphy, palaeontology, palaeogeography, petrology, geochemistry, tectonics) are less structured and more heterogeneous.
Sadly, the long tail holds back not just geoscience but also many other observational and descriptive sciences, like biology and zoology. A recent Royal Society report showed that rare disease data resemble data in the geoscience long tail: they are limited in availability and hard to apply to disease research. This holds back progress.
GeoGPT has endeavoured to develop new data standards and methods. The best-known LLMs, like OpenAI’s GPT series of models (e.g. GPT-3.5 and GPT-4, used in ChatGPT and Microsoft Copilot) and Google’s PaLM and Gemini, were trained on very large datasets and text from across the internet. But until now, there has been no geoscience-specific LLM or GPT. So, in early 2024, DDE and Zhejiang Lab embarked on an exploratory research project to create a system trained on open-access geoscience data. GeoGPT is now the most advanced multimodal AI assistant for the geosciences: it can extract key information from geoscience documents, develop computer code, and draw charts and graphs from text. GeoGPT provides Retrieval-Augmented Generation (RAG) so that the sources of its answers can be traced to individual articles and papers. A user can now also choose between Chinese (Qwen, DeepSeek), French (Mixtral) and American (Llama) base models to compare the results, an option that is not commonly offered by LLMs.
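For readers unfamiliar with RAG, the idea of tracing an answer back to individual papers can be illustrated with a toy example. The sketch below is not GeoGPT's implementation, which is not described here; the three-passage corpus, the source labels (`paper-A` and so on) and the function names are all hypothetical, and a simple word-overlap score stands in for the learned embeddings a real system would likely use. It shows only the general pattern: find the passages most relevant to a question, keep their source labels attached, and hand both to the language model.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus: every passage keeps the label of the paper it came from.
CORPUS = [
    {"source": "paper-A", "text": "Palynology uses fossil pollen and spores to date sedimentary rocks."},
    {"source": "paper-B", "text": "Seismic sensors record earthquake waves at curated data centres."},
    {"source": "paper-C", "text": "Critical minerals such as lithium are essential for the energy transition."},
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, corpus, k=2):
    """Return up to k passages that share words with the question, best match first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p["text"]))), p) for p in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(question, passages):
    """Assemble the text sent to the language model, with a source label on each passage."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return (
        "Answer using only the sources below and cite them by label.\n"
        f"{context}\nQuestion: {question}"
    )


if __name__ == "__main__":
    question = "How are fossil pollen used to date rocks?"
    print(build_prompt(question, retrieve(question, CORPUS, k=1)))
```

Because each retrieved passage carries its label into the prompt, the model's answer can cite that label, and a reader can follow it back to the original article.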
Future collaboration between DDE and Zhejiang Lab includes attempts to adapt GeoGPT to work with discipline- or problem-oriented data sets, for example palynology taxonomies. Unlike the general-purpose LLMs behind ChatGPT and similar tools, such discipline- or problem-focused analysis of information would have to depend on much smaller data sets. The UK Royal Society report listed AI and small databases as a research challenge; hopefully, the DDE-Zhejiang Lab collaboration will yield results that enable GeoGPT to address parts of that challenge!
The emergence of platforms like DDE and GeoGPT indicates the urge amongst geoscientists to move towards open science and better access to data and tools. It may change geoscience research, education and applications worldwide, particularly in relation to contemporary global challenges such as the energy transition and access to and use of critical minerals and rare earths.
Mike Stephenson and Ish Natarajan