Nelson Jovel
289b04bc69
match an additional format for dates. Supported date formats are now '1/10/2022 14:21:45', '2022-1-10T14:21:45', and '2022-1-10 14:21:45'
2 years ago
Nelson Jovel
d6735d449d
feat: Support two date formats: ISO 8601 and the standard US date format
used in google sheets
2 years ago
Nelson Jovel
9696a2b2fa
fix: fix failing test
2 years ago
Nelson Jovel
0a32fb50ff
fix: no longer support 'form' in filename when cleaning. Only look for 'part X' and add that to the filename if it exists
2 years ago
Nelson Jovel
ed07114a91
fix: fix failing tests
2 years ago
Nelson Jovel
880b438eb4
chore: reenable test spec that tests data loader for races
2 years ago
Nelson Jovel
36e21515c3
chore: refactor Race out of survey_item_values
2 years ago
Nelson Jovel
e7fb009425
chore: refactor Gender out of survey_item_values row
2 years ago
Nelson Jovel
41d942c214
chore: Make sure 'hispanic' column only gets applied when using SIS race information
2 years ago
Nelson Jovel
f028e6c884
feat: if the filename includes the words 'form' or 'part' add that to the resulting cleaned filename
2 years ago
Nelson Jovel
d90a83e510
fix: instead of looking for 'asian' at the start of a word, look for it
after a word boundary. This means it still doesn't get confused with
'caucasian' and it's more flexible when 'asian' appears inside other text
such as 'Caucasian and Asian and Black'
2 years ago
Nelson Jovel
3f44613085
chore: various fixes for race and gender categorization during cleaning.
Also add tests for race and gender categorization
2 years ago
Nelson Jovel
b7e670bb60
Lower threshold for the number of valid student responses from 17 to 11
2 years ago
Nelson Jovel
6e05909423
chore: fix categorization of gender
2 years ago
Nelson Jovel
e325f38c43
Convert gender and race text into qualtrics codes during cleaning. Abide by 'prefer not to disclose' for self reported race. Give priority to self reported data but use SIS information as backup
2 years ago
Nelson Jovel
305ddf2b1a
chore: add test for checking duplicate headers during cleaning process
2 years ago
Nelson Jovel
b63c327d33
chore: when searching for dese id, split up the pattern to be more explicit about the order in which to search the columns that might have the dese ID we're looking for.
2 years ago
rebuilt
1a707eb6bc
feat: load student responses in the same pass as loading the survey responses
chore: remove student loader since loading students is now done with the survey response loader
2 years ago
rebuilt
e3fbbabce5
feat: We no longer trust the progress number that gets exported from qualtrics. Instead, during the cleaning process, perform a manual count of the number of responses to filter out rows that don't meet the minimum threshold.
2 years ago
rebuilt
83661540b7
chore: upgrade to rails 7.1.
upgrade rspec
fix failing tests
upgrade devise
2 years ago
rebuilt
48e795fcfb
feat: add special education disaggregation
2 years ago
rebuilt
060d7aa55a
Add disaggregation by ELL
2 years ago
rebuilt
abea2cb8fa
feat: support multiple columns for race and gender information
2 years ago
rebuilt
714b90b3eb
fix: ensure cleaner outputs columns for all survey items. Before the fix, if a survey item variant (ending in -1, i.e. s-tint-q1-1) did not have a matching survey item s-tint-q1, the resulting csv would not include that column
2 years ago
rebuilt
a785c69c44
Add Overall Response Rate
2 years ago
rebuilt
4afa030141
chore: remove precalculated race scores. Calculate race scores on every reload
2 years ago
rebuilt
cec48e55d3
chore: remove outdated admin data loader file. We now use Dese::Loader to load school level data
2 years ago
rebuilt
5c7729beeb
feat: if admin data value is above 5, round down to 5
2 years ago
rebuilt
4f035f6a63
feat: Add income table to the database. Add seeder for income. Add a reference to income from survey item response. Update the loader to import income data from the survey response csv. Refactor analyze controller to extract presenter. Add corresponding specs. Add income graph to analyze page
2 years ago
rebuilt
d72f8d31e0
fix: There was an n+1 problem where we looked up the list of schools for
every row. Now we query the list of schools just once per file
2 years ago
rebuilt
e8aa75bf66
feat: update survey_item_response table to include recorded date and import recorded date when loading responses
2 years ago
Nelson Jovel
0a2c5e02c5
feat: add ability to merge disaggregation data with raw survey data to
produce a cleaned csv with merged income disaggregation columns
3 years ago
rebuilt
411c632c25
chore: remove errant comment
3 years ago
rebuilt
30285efd69
It's possible for admin data likert score values to be above 5. If that happens, we
cap the likert score at 5. This was happening already at the scraper
level but it's also now being done by the admin data loader for safety.
Also make sure to just update admin data instead of deleting and
reloading all values each load. Add tests to confirm this behavior
3 years ago
rebuilt
9aeb5f92af
Missing progress or duration information does not result in a row being removed during the cleaning process
3 years ago
rebuilt
e3ae12b425
update response_date to recorded_date
3 years ago
rebuilt
8ef8cfce58
Adjust valid duration threshold of short form items
3 years ago
rebuilt
4509c157fa
Add automated data cleaning. Modify SurveyItemValues class to use regex
instead of hard coded values. Produce a clean csv and a csv with all
the removed values and columns with reason for removal. Add script for
running cleaning for each project
3 years ago
rebuilt
3f2a7dff50
Fix problem with dese scraper lumping in 2021-22 data as 2022-23 data.
Deleted unused csvs. Turned off puts statements in admin loader.
Remove old, now unused admin data loader class.
3 years ago
rebuilt
128748addd
Update logic for calculating student response rate. Remove references
to survey table. We no longer check or keep track of the survey type.
Instead we look in the database to see if a survey item has at least 10
responses. If it does, that survey item was presented to the respondent
and we count it, and all responses when calculating the response rate.
Remove response rate timestamp from caching logic because we no longer
add the response rate to the database. All response rates are calculated
on the fly
Update three_b_two scraper to use teacher only numbers
swap over to using https://profiles.doe.mass.edu/statereport/gradesubjectstaffing.aspx as the source of staffing information
3 years ago
rebuilt
b250ebe415
Memoize schools in SurveyItemValues and academic_years in AcademicYear
for performance improvement
3 years ago
rebuilt
c15cb7b483
Change survey data loader spec to use factorybot objects instead of loading seeds. Change databasecleaner to use transaction. Add back babel-preset dependency to fix failing javascript test in production.
3 years ago
rebuilt
fbfaafa996
Update school/framework files for ecp schools. Fix tests. Update fixture files with dese ids of schools in the consortium.
3 years ago
rebuilt
4c4ccc01cc
Merge branch 'rpp-response-rate' to bring in changes to test files
3 years ago
rebuilt
6b31fa9115
Batch imports for staffing data
3 years ago
rebuilt
d059177f0c
load total students and batch import records
3 years ago
Nelson Jovel
bfa5f28d7b
Convert dese::loader from using seeder to factories
3 years ago
Nelson Jovel
7a7b78a9e0
convert student loader from seeding to factories
3 years ago
rebuilt
2cf2b7d7c1
turn off some slow tests that don't add value
3 years ago
rebuilt
2362d884eb
Convert admin data loader from using seeder to using factory
3 years ago