Improve essay scoring algorithms to support better student learning outcomes.
Overview
This repository contains Jupyter notebooks that demonstrate the process of building and evaluating machine learning models for automated essay scoring.
The goal is to create reliable and efficient models that can provide timely feedback to students and support educators, especially in underserved communities.
Approaches
LightGBM + TF-IDF
This approach involves using LightGBM (Light Gradient Boosting Machine) with TF-IDF (Term Frequency-Inverse Document Frequency) for feature extraction to classify essay scores.
LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, with the capability to handle large-scale data with high performance and speed. In this project, LightGBM is used to build a model that can accurately predict essay scores based on features extracted from the text data.
TF-IDF is a statistical measure used to evaluate the importance of a word in a document relative to a collection of documents. It is a numerical representation of text that transforms the raw text into features that can be used by machine learning algorithms. By applying TF-IDF, the textual data from essays is converted into a format that LightGBM can process, capturing the relevance and significance of each term within the essays.
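To make the weighting concrete, here is a minimal sketch of the textbook formula, tf × log(N / df), in plain Python. The function name `tf_idf` and the sample sentences are invented for illustration and are not taken from the notebooks. Libraries such as scikit-learn use a smoothed variant, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical mini-corpus standing in for real essays.
essays = [
    "the essay argues the point clearly",
    "the essay lacks evidence",
    "the conclusion restates the thesis",
]
weights = tf_idf(essays)
```

A term that occurs in every document, such as "the", gets weight zero, while a term unique to one essay, such as "evidence", gets the largest weight for its frequency.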
Combining the two gives a classifier that works directly on essay text: TF-IDF turns each essay into a sparse vector of term weights, and LightGBM learns which weighted terms are associated with each score. The combination relies on LightGBM's efficiency on large, sparse datasets and on TF-IDF's ability to emphasize distinctive terms.
RAPIDS SVR
RAPIDS SVR (Support Vector Regression) is utilized for regression tasks on large datasets, leveraging GPU acceleration for faster computation. RAPIDS is an open-source suite of software libraries and APIs built on CUDA, which enables execution on NVIDIA GPUs.
SVR is a type of Support Vector Machine (SVM) used for regression challenges. It seeks to find a function that deviates from the actual observed values by a value no greater than a specified margin and at the same time is as flat as possible. SVR is particularly effective in cases where the relationship between data points is complex and non-linear.
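The "specified margin" corresponds to the epsilon-insensitive loss: errors smaller than epsilon cost nothing, and larger errors are penalized only by the amount they exceed it. A minimal sketch in plain Python, with a hypothetical function name and made-up values:

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.5):
    """Mean loss that ignores absolute errors smaller than epsilon."""
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Made-up scores: only the second prediction misses by more than epsilon.
actual = [3.0, 4.0, 2.0]
predicted = [3.25, 5.0, 2.0]
loss = epsilon_insensitive_loss(actual, predicted)
```

Here the errors are 0.25, 1.0, and 0.0, so only the second contributes (1.0 - 0.5 = 0.5), and the mean loss is 0.5 / 3.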
In this project, RAPIDS SVR runs training and prediction on the GPU. This shortens both steps on large essay datasets and makes it practical to fit a regression model on data that would be slow to process on a CPU.
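A minimal sketch of fitting an SVR follows. It tries to import the GPU implementation from RAPIDS cuML and falls back to scikit-learn's CPU implementation, which exposes a similar `fit`/`predict` interface. The two features, the scores, and the hyperparameters are assumptions made for illustration and do not come from the notebooks.

```python
import numpy as np

try:
    from cuml.svm import SVR  # GPU implementation from RAPIDS, if installed
except ImportError:
    from sklearn.svm import SVR  # CPU fallback with a similar interface

# Hypothetical features per essay: [word count, distinct-word ratio].
X = np.array(
    [[120, 0.45], [340, 0.62], [510, 0.70], [90, 0.40], [420, 0.66], [260, 0.55]],
    dtype=np.float32,
)
y = np.array([2.0, 4.0, 5.0, 1.0, 4.0, 3.0], dtype=np.float32)

# Standardize so both features contribute comparably to the RBF kernel.
X = (X - X.mean(axis=0)) / X.std(axis=0)

model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)
predictions = model.predict(X)
```

Because the estimator is chosen at import time, the same script can be run on a machine without a GPU for quick checks and on a GPU machine for full-size data.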
npm start
Runs the app in the development mode.
Open http://localhost:3000 to view it in your browser.
The page will reload when you make changes.
You may also see any lint errors in the console.
npm test
Launches the test runner in the interactive watch mode.
See the section about running tests for more information.
npm run build
Builds the app for production to the build folder.
It correctly bundles React in production mode and optimizes the build for the best performance.
The build is minified and the filenames include the hashes.
Your app is ready to be deployed!
See the section about deployment for more information.
npm run eject
Note: this is a one-way operation. Once you eject, you can’t go back!
If you aren’t satisfied with the build tool and configuration choices, you can eject at any time. This command will remove the single build dependency from your project.
Instead, it will copy all the configuration files and the transitive dependencies (webpack, Babel, ESLint, etc) right into your project so you have full control over them. All of the commands except eject will still work, but they will point to the copied scripts so you can tweak them. At this point you’re on your own.
You don’t have to ever use eject. The curated feature set is suitable for small and middle deployments, and you shouldn’t feel obligated to use this feature. However we understand that this tool wouldn’t be useful if you couldn’t customize it when you are ready for it.
This package contains two extensions that add support for frontmatter syntax
as often used in markdown to mdast.
These extensions plug into
mdast-util-from-markdown (to support parsing
frontmatter in markdown into a syntax tree) and
mdast-util-to-markdown (to support serializing
frontmatter in syntax trees to markdown).
Frontmatter is a metadata format in front of the content.
It’s typically written in YAML and is often used with markdown.
Frontmatter does not work everywhere so it makes markdown less portable.
These extensions follow how GitHub handles frontmatter.
GitHub only supports YAML frontmatter, but these extensions also support
different flavors (such as TOML).
When to use this
You can use these extensions when you are working with
mdast-util-from-markdown and mdast-util-to-markdown already.
The YAML node type is supported in @types/mdast by default.
To add other node types, register them by adding them to
FrontmatterContentMap:
import type {Literal} from 'mdast'

interface Toml extends Literal {
  type: 'toml'
}

declare module 'mdast' {
  interface FrontmatterContentMap {
    // Allow using TOML nodes defined by `mdast-util-frontmatter`.
    toml: Toml
  }
}
Compatibility
Projects maintained by the unified collective are compatible with maintained
versions of Node.js.
When we cut a new major release, we drop support for unmaintained versions of
Node.
This means we try to keep the current release line,
mdast-util-frontmatter@^2, compatible with Node.js 16.
This utility works with mdast-util-from-markdown version 2+ and
mdast-util-to-markdown version 2+.
I bought 500 Twitter followers from one of the leading providers for this kind of service. What I got can be found in fake_accounts.csv: users who nearly all have about 1,500 tweets (most of them retweets) about Trump, porn, and Saudi Arabia. 1/10 would not buy again.
born&raised in Maine {McAuley girl for Life}& wicked New England Sports fan. now in S.NJ married to a Philadelphia crazed sports fan; fam,friends&music = love