Theses from University of Jyväskylä

This dataset contains TF-IDF data matrices intended for machine learning use. The matrices are generated from document corpora built from 7,400 master's and doctoral theses published between 2010 and 2017 and collected from the University of Jyväskylä digital repository. There are corpora in Finnish, Swedish and English.
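For readers unfamiliar with the format, the following is a minimal sketch of how a TF-IDF matrix can be derived from a tokenized corpus. It is illustrative only: the weighting used here (relative term frequency times log(N / document frequency)) is one common variant and is an assumption, since the dataset description does not state which scheme or preprocessing was used. The function name `tfidf_matrix` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocabulary, matrix) for a list of tokenized documents.

    Assumed weighting (not confirmed for this dataset):
    tf = term count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        row = [
            (counts[t] / total) * math.log(n_docs / df[t]) if total else 0.0
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


# Toy corpus standing in for thesis texts (hypothetical data).
docs = [
    "thesis on machine learning".split(),
    "thesis on finnish literature".split(),
    "machine translation of finnish".split(),
]
vocab, matrix = tfidf_matrix(docs)
```

Each row of `matrix` corresponds to one document and each column to one term in `vocab`. Terms that are rare across the corpus receive higher weights, and a term that occurs in every document receives a weight of zero under this variant.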

Data resources

Additional Info

Collection: Open Data
Maintainer: CSC – IT Center For Science Ltd.
Maintainer email: analytics@csc.fi
Links to additional information: https://www.avoindata.fi/data/fi/dataset/finna-koulutuskorpukset
Geographical coverage: not specified
Update frequency: not specified
Last modified: 26.02.2021
Created on: 24.02.2021