Gabe Farrell
6fac759ec2
Faster admin data loader + rename School.school_hash
2 years ago
Nelson Jovel
9bfb76db5a
match an additional format for Dates. Supported dates are now '1/10/2022 14:21:45' '2022-1-10T14:21:45' '2022-1-10 14:21:45'
2 years ago
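A minimal sketch of how the three date shapes quoted in the commit above could be tried in turn. The helper name `parse_recorded_date` and the `SUPPORTED_FORMATS` constant are hypothetical, not taken from the repository, and the first format is assumed to be month/day/year.

```ruby
require 'date'

# Hypothetical format list mirroring the three examples in the commit message.
SUPPORTED_FORMATS = [
  '%m/%d/%Y %H:%M:%S',  # 1/10/2022 14:21:45 (assumed month/day/year)
  '%Y-%m-%dT%H:%M:%S',  # 2022-1-10T14:21:45
  '%Y-%m-%d %H:%M:%S'   # 2022-1-10 14:21:45
].freeze

# Try each format in order and return the first successful parse, or nil.
def parse_recorded_date(text)
  SUPPORTED_FORMATS.each do |format|
    begin
      return DateTime.strptime(text, format)
    rescue ArgumentError
      # This format did not match; fall through to the next one.
      next
    end
  end
  nil
end
```

Returning nil for unmatched input leaves it to the caller to decide whether an unparseable date removes the row.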
Nelson Jovel
c3cb05701f
feat: Support two date formats: ISO 8601 and the standard US date format
used in google sheets
2 years ago
Nelson Jovel
aa7af11a4e
fix: fix failing test
2 years ago
Nelson Jovel
cc8ed48204
fix: no longer support 'form' in filename when cleaning. Only look for 'part X' and add that to the filename if it exists
2 years ago
Nelson Jovel
d907e2742e
fix: fix failing tests
2 years ago
Nelson Jovel
a0c0b1d01d
chore: reenable test spec that tests data loader for races
2 years ago
Nelson Jovel
d4109fda6f
chore: refactor Race out of survey_item_values
2 years ago
Nelson Jovel
3e4ef9cb08
chore: refactor Gender out of survey_item_values row
2 years ago
Nelson Jovel
6a24d4fa23
chore: Make sure 'hispanic' column only gets applied when using SIS race information
2 years ago
Nelson Jovel
ce43f52bd5
feat: if the filename includes the words 'form' or 'part' add that to the resulting cleaned filename
2 years ago
Nelson Jovel
a15b01a3e1
fix: instead of looking for 'asian' at the start of a word, look for it
after a word boundary. This means it still doesn't get confused with
caucasian and it's more flexible when asian appears inside other text
such as 'Caucasian and Asian and Black'
2 years ago
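The word-boundary behaviour described in the commit above can be illustrated with a short sketch. The constant and method names are hypothetical, and the repository's actual pattern may differ.

```ruby
# Hypothetical pattern: \b requires a word boundary immediately before "asian",
# so the "asian" inside "caucasian" does not match.
ASIAN_PATTERN = /\basian/i

# True when the text mentions "asian" as the start of a word anywhere in the string.
def mentions_asian?(text)
  text.match?(ASIAN_PATTERN)
end
```

With this pattern `mentions_asian?('Caucasian')` is false, while `mentions_asian?('Caucasian and Asian and Black')` is true.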
Nelson Jovel
8a0ba0dbea
chore: various fixes for race and gender categorization during cleaning.
Also add tests for race and gender categorization
2 years ago
Nelson Jovel
2ef24caf70
Lower threshold for the number of valid student responses from 17 to 11
2 years ago
Nelson Jovel
b983f1e144
chore: fix categorization of gender
2 years ago
Nelson Jovel
f27a590c5a
Convert gender and race text into qualtrics codes during cleaning. Abide by 'prefer not to disclose' for self reported race. Give priority to self reported data but use SIS information as backup
2 years ago
Nelson Jovel
97ddb09167
chore: add test for checking duplicate headers during cleaning process
2 years ago
Nelson Jovel
6d84204f83
Add race and gender columns to cleaned csv files when those headers are
missing
2 years ago
Nelson Jovel
a3f9e46414
chore: when searching for dese id, split up the pattern to be more explicit about the order in which to search the columns that might have the dese ID we're looking for.
2 years ago
rebuilt
019b954ffa
feat: load student responses in the same pass as loading the survey responses
chore: remove student loader since loading students is now done with the survey response loader
2 years ago
rebuilt
b2fdbe5756
feat: We no longer trust the progress number that gets exported from qualtrics. Instead, during the cleaning process, perform a manual count of the number of responses to filter out rows that don't meet the minimum threshold.
2 years ago
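A rough sketch of the manual count described in the commit above, under stated assumptions: a row is modelled as a Hash from column header to raw cell value, `item_headers` lists the survey item columns, and the threshold is passed in by the caller. None of these names come from the repository.

```ruby
# Hypothetical helper: count the survey item cells that actually hold an answer
# and compare against a caller-supplied minimum, ignoring any exported progress value.
def enough_responses?(row, item_headers, threshold:)
  answered = item_headers.count { |header| !row[header].to_s.strip.empty? }
  answered >= threshold
end
```

A cleaner built this way would keep rows where `enough_responses?` is true and write the rest to the removed-rows output.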
rebuilt
e45a4f96dd
last commit
2 years ago
rebuilt
ef44c41965
feat: add special education disaggregation
2 years ago
rebuilt
18ab51c860
chore: upgrade to rails 7.1.
upgrade rspec
fix failing tests
upgrade devise
2 years ago
rebuilt
2fd56047d4
Add disaggregation by ELL
2 years ago
rebuilt
490522eb1e
feat: support multiple columns for race and gender information
2 years ago
rebuilt
7bd7923d41
fix: ensure cleaner outputs columns for all survey items. Before the fix, if a survey item variant (ending in -1, i.e. s-tint-q1-1) did not have a matching survey item s-tint-q1, the resulting csv would not include that column
2 years ago
rebuilt
2ac30bb107
feat: Add income table to the database. Add seeder for income. Add a reference to income from survey item response. Update the loader to import income data from the survey response csv. Refactor analyze controller to extract presenter. Add corresponding specs. Add income graph to analyze page
2 years ago
rebuilt
a4332f6a05
chore: remove outdated admin data loader file. We now use Dese::Loader to load school level data
2 years ago
rebuilt
23ddaed2ce
feat: if admin data value is above 5, round down to 5
2 years ago
rebuilt
878ba08a22
fix: There was an n+1 problem where we looked up the list of schools for
every row. Now we query the list of schools just once per file
2 years ago
rebuilt
d025a83a2b
chore: remove errant comment
3 years ago
rebuilt
0f23053294
It's possible for admin data likert score values to be above 5. If that happens, we
cap the likert score at 5. This was happening already at the scraper
level but it's also now being done by the admin data loader for safety.
Also make sure to just update admin data instead of deleting and
reloading all values each load. Add tests to confirm this behavior
3 years ago
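The capping rule in the commit above amounts to clamping at an upper bound. A minimal sketch, with a hypothetical constant and method name:

```ruby
# Upper bound for likert scores, as stated in the commit message.
MAX_LIKERT_SCORE = 5

# Return the score unchanged unless it exceeds the maximum, in which case return the maximum.
def cap_likert_score(score)
  [score, MAX_LIKERT_SCORE].min
end
```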
rebuilt
e058c523b6
Missing progress or duration information does not result in a row removed in the cleaning process
3 years ago
rebuilt
a71ebbc4e4
Add Overall Response Rate
3 years ago
rebuilt
dbfc9d1d3a
Add automated data cleaning. Modify SurveyItemValues class to use regex
instead of hard coded values. Produce a clean csv and a csv with all
the removed values and columns with reason for removal. Add script for
running cleaning for each project
3 years ago
rebuilt
65b8599c6e
Update logic for calculating student response rate. Remove references
to survey table. We no longer check or keep track of the survey type.
Instead we look in the database to see if a survey item has at least 10
responses. If it does, that survey item was presented to the respondent
and we count it, and all responses when calculating the response rate.
Remove response rate timestamp from caching logic because we no longer
add the response rate to the database. All response rates are calculated
on the fly
Update three_b_two scraper to use teacher only numbers
swap over to using https://profiles.doe.mass.edu/statereport/gradesubjectstaffing.aspx as the source of staffing information
3 years ago
rebuilt
8bd65d367b
make sure spec tests what it's supposed to test: that the value of the responses gets updated when new information is loaded from another csv
3 years ago
rebuilt
282a671531
Change survey data loader spec to use factorybot objects instead of loading seeds. Change databasecleaner to use transaction. Add back babel-preset dependency to fix failing javascript test in production.
3 years ago
rebuilt
825259bdd8
Merge branch 'rpp-response-rate' into rpp-main to bring in improvements
to how we get enrollment and staffing information. Also speed up tests
3 years ago
rebuilt
6b31fa9115
Batch imports for staffing data
3 years ago
rebuilt
d059177f0c
load total students and batch importing records
3 years ago
Nelson Jovel
bfa5f28d7b
Convert dese::loader from using seeder to factories
3 years ago
Nelson Jovel
7a7b78a9e0
convert student loader from seeding to factories
3 years ago
rebuilt
2cf2b7d7c1
turn off some slow tests that don't add value
3 years ago
rebuilt
2362d884eb
Convert admin data loader from using seeder to using factory
3 years ago
rebuilt
d0219217de
Convert response rate loader spec from using the seeder to using the factory
3 years ago
rebuilt
06f9d2f0e9
Scrape enrollment and staffing information. Seed enrollment and staffing information. Update DatabaseCleaner so it cleans up leftover information in the database. Remove old admin csvs from codebase.
3 years ago
rebuilt
c0332955f3
move csv require statement to application.rb
3 years ago
rebuilt
7915a1bc7c
rename respondents to respondent
3 years ago