The LAK Dataset makes machine-readable versions of research sources from the Learning Analytics and Educational Data Mining communities publicly available. Its main goal is to facilitate research, analysis and smart explorative applications.
This site provides access to structured full text and metadata from key research publications in the field. This advances SoLAR’s mission in two ways: it offers more comprehensive search facilities for discovering relevant work in the growing corpus, and it enables researchers to analyse the field, for instance to track the evolution of a topic over time or to identify correlations with related communities.
The ACM International Conference on Learning Analytics and Knowledge (LAK), sponsored by SoLAR, is the field’s premier research forum, providing common ground for academics, administrators, software developers and companies to shape and debate the state of the art in learning analytics and related fields.
The ACM conditions for providing the full text of the LAK Conference Proceedings specify:
- ACM is providing this ACM Digital Library data solely for research purposes, gratis. Should software that is beneficial to the users of the ACM Digital Library be developed using this data, whenever feasible, ACM would appreciate an as-is perpetual royalty-free license to that software to be used by ACM solely in the context of ACM’s Digital Library services to benefit the Computer Science community.
The dataset covers the following publications:
- Proceedings of the ACM International Conference on Learning Analytics and Knowledge (LAK) (2011-14)
- Proceedings of the LAK Data Challenge (2013-14)
- Proceedings of the International Conference on Educational Data Mining (2008-14)
- Educational Technology & Society (open access journal), Special Issue on “Learning and Knowledge Analytics”, edited by George Siemens & Dragan Gašević, 2012, 15 (3), pp. 1-163
- Journal of Educational Data Mining (2009-14)
- Journal of Learning Analytics (2014)
The publications have been processed to create a corpus containing the full text together with metadata, including authors, affiliations, titles, keywords and abstracts. The schema used to describe the papers in the dataset is based on two established schemas: the Semantic Web Conference Ontology (already used to describe metadata about publications from the Semantic Web conferences and related events) and the Linked Education schema. The data is accessible in various forms:
- Zipped dataset dump file for download [RDF] [NT]
- R format (thanks to Adam Cooper), hosted on the KMi Crunch R server [LAK+EDM]
- A public SPARQL endpoint, built on semantic web infrastructure, provides access to structured RDF metadata according to LOD principles. The endpoint is available via http://lak.linkededucation.org/request/lak-conference/sparql?query=[your sparql query]. View some example queries.
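As an illustration of the endpoint URL pattern above, the following minimal Python sketch URL-encodes a SPARQL query into the `query` parameter and, optionally, sends the request. The helper names `build_query_url` and `run_query` are hypothetical, the example query is a generic "first ten triples" query rather than one of the site's example queries, and the `Accept` header for JSON results is an assumption about what the endpoint supports. The endpoint's current availability is also not guaranteed.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint base URL as given on this page.
ENDPOINT = "http://lak.linkededucation.org/request/lak-conference/sparql"

# Generic illustrative query: fetch any ten triples from the dataset.
EXAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(sparql, endpoint=ENDPOINT):
    """Return the endpoint URL with the SPARQL text URL-encoded as ?query=..."""
    return endpoint + "?" + urlencode({"query": sparql})


def run_query(sparql, timeout=30):
    """Send the query and return the raw response body as text.

    Assumption: the endpoint honours an Accept header asking for
    SPARQL JSON results; if it does not, the body may be XML instead.
    """
    request = Request(
        build_query_url(sparql),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `print(build_query_url(EXAMPLE_QUERY))` shows the URL that would be requested without touching the network, while `run_query(EXAMPLE_QUERY)` performs the actual request. Encoding the query with `urlencode` matters because SPARQL text contains characters such as `?`, `{` and spaces that are not valid in a raw URL.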
Explore the dataset
People and Organisations
- Stefan Dietze (L3S Research Center, Germany)
- Davide Taibi (Institute for Educational Technologies CNR, Italy)
- Simon Buckingham Shum (SoLAR)
Since the LAK Dataset is the result of work by several individuals and organisations, we would like to ask all users of the dataset to include the following acknowledgements in any papers referring to the LAK Dataset:
- “We gratefully acknowledge the publishers who have contributed to the LAK Dataset: ACM, International Educational Data Mining Society and Journal of Educational Technology & Society.”
- Please also include a reference to the following publication: Taibi, D. and Dietze, S. (2013), Fostering Analytics on Learning Analytics Research: the LAK Dataset. In: CEUR WS Proceedings Vol. 974, Proceedings of the LAK Data Challenge, held at LAK2013 – 3rd International Conference on Learning Analytics and Knowledge (Leuven, BE, April 2013). (PDF online)