Machine learning datasets related to education

I'm about to start an ML project for a class, and I would like to build something like a “recommendation system” for education (i.e. a system that says what the student should do next).

Further development: it would be interesting to build a system that can predict students' accuracy and try to recommend content that can increase that accuracy (in, say, mathematical problems).

I do not have a concrete project yet, and I have no means of collecting data right now, so I would like to look at the available data in order to shape a project. Any dataset related to education would be helpful (and/or inspirational).

I found some interesting datasets for training ML here on Stack Overflow: "Netflix Prize dataset for analysis with collaborative filtering (CF) algorithms" and "Dataset for a data mining project", but unfortunately nothing related to education, as far as I could tell.

+4
3 answers

The UCI Machine Learning Repository is a great source of data for machine learning.

There is a publicly available Teaching Assistant Evaluation dataset that may meet your needs:

http://archive.ics.uci.edu/ml/datasets/Teaching+Assistant+Evaluation

Collector :

Wei-Yin Loh (Department of Statistics, UW-Madison)

Donor :

Tjen-Sien Lim (limt '@' stat.wisc.edu)

Dataset Information :

The data consist of evaluations of teaching performance over three regular semesters and two summer semesters of 151 teaching assistant (TA) assignments in the Department of Statistics at the University of Wisconsin-Madison. The scores were divided into 3 roughly equal-sized categories (“low”, “medium” and “high”) to form the class variable.

Attribute Information :

  • Whether or not the TA is a native English speaker (binary); 1 = English speaker, 2 = non-English speaker
  • Course Instructor (categorical, 25 categories)
  • Course (categorical, 26 categories)
  • Summer or regular semester (binary); 1 = summer, 2 = regular
  • Class Size (Numeric)
  • Class attribute (categorical) 1 = low, 2 = medium, 3 = high
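
To make the attribute list concrete, here is a minimal sketch of parsing data in that layout with the Python standard library. It assumes the raw file is headerless, comma-separated, and stores the six attributes as integers in the order listed above. The column names and the sample rows are my own inventions for illustration, not real records from the dataset.

```python
import csv
import io
from collections import Counter

# Column names are hypothetical labels for the six attributes listed above;
# the raw file is assumed to be headerless, comma-separated integers.
COLUMNS = [
    "native_english",  # 1 = English speaker, 2 = non-English speaker
    "instructor",      # categorical, 25 categories
    "course",          # categorical, 26 categories
    "semester",        # 1 = summer, 2 = regular
    "class_size",      # numeric
    "score_class",     # 1 = low, 2 = medium, 3 = high
]

# Made-up rows in the assumed format, for illustration only (not real data).
SAMPLE = """1,23,3,1,19,3
2,15,3,1,17,3
2,7,11,2,55,2
1,10,3,2,27,1
"""


def load_tae(text):
    """Parse headerless CSV text into a list of dicts with integer values."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(COLUMNS, map(int, row))) for row in reader if row]


rows = load_tae(SAMPLE)

# Distribution of the class variable (the target to predict).
class_counts = Counter(r["score_class"] for r in rows)

# Mean class size over summer-semester rows.
summer_sizes = [r["class_size"] for r in rows if r["semester"] == 1]
summer_mean = sum(summer_sizes) / len(summer_sizes)

print(class_counts)
print(summer_mean)
```

To use it on the real file, read the downloaded data into a string and pass it to `load_tae` instead of `SAMPLE`. Note that instructor and course are category codes, so they should be one-hot encoded (or similar) rather than treated as numbers.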
+3

In the machine learning class we took, we competed on the shared tasks from CoNLL. We built many different types of learners in teams to compete against each other.

Another place to find datasets is Kaggle ( http://www.kaggle.com/competitions ). It has many different types of datasets, and they are fun to work with too.

0

My choice is https://pslcdatashop.web.cmu.edu/ (a site specializing in educational datasets).

In particular, in 2010 they hosted the KDD Cup, whose task was to predict students' accuracy based on their previous work: https://pslcdatashop.web.cmu.edu/KDDCup/rules_evaluation.jsp

It is a fairly large dataset, and you can also read the papers written by the people who participated (which is very useful!).
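
To give a feel for the prediction task, here is a rough baseline sketch: predict each student's probability of answering correctly as their past fraction of correct answers, and score the predictions with root-mean-squared error. I am assuming an RMSE-style metric here, so check the rules page linked above for the exact evaluation. The tuple layout, the student and problem IDs, and all rows are invented for illustration and are not taken from the DataShop data.

```python
import math
from collections import defaultdict

# Hypothetical interaction log: (student_id, problem_id, correct_first_attempt).
# These rows are invented for illustration only.
train = [
    ("s1", "p1", 1), ("s1", "p2", 1), ("s1", "p3", 0), ("s1", "p4", 1),
    ("s2", "p1", 0), ("s2", "p2", 0), ("s2", "p3", 1), ("s2", "p4", 0),
]
test = [("s1", "p5", 1), ("s2", "p5", 0), ("s3", "p1", 1)]


def fit_student_means(rows):
    """Return per-student accuracy and the overall accuracy as a fallback."""
    totals = defaultdict(lambda: [0, 0])  # student -> [num_correct, num_attempts]
    for student, _problem, correct in rows:
        totals[student][0] += correct
        totals[student][1] += 1
    per_student = {s: c / n for s, (c, n) in totals.items()}
    global_mean = sum(r[2] for r in rows) / len(rows)
    return per_student, global_mean


def predict(student, per_student, global_mean):
    """Predict P(correct); unseen students fall back to the global mean."""
    return per_student.get(student, global_mean)


def rmse(preds, targets):
    """Root-mean-squared error between predictions and 0/1 outcomes."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets))


per_student, global_mean = fit_student_means(train)
preds = [predict(s, per_student, global_mean) for s, _p, _c in test]
score = rmse(preds, [c for _s, _p, c in test])
print(preds, score)
```

A baseline like this ignores the problem entirely, so it is only useful as a floor to beat. The papers from the participants are the place to look for features that actually work.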

0

Source: https://habr.com/ru/post/1498001/

