Nelson Jovel
ee41751f4e
chore: correct parsing for 'not sped' and 'lep not first year'
2024-06-26 12:03:31 -07:00
Nelson Jovel
d041a5a567
chore: During cleaning, stop execution if grade column isn't found. Also stop execution if a duplicate header is found. Turn off spec for duplicate header check
2024-05-23 12:52:34 -07:00
Nelson Jovel
c4d4c35766
update parsing rules from glossary
2024-05-15 10:54:12 -07:00
Nelson Jovel
8e7fbdfb2c
add disaggregation glossary
2024-05-15 10:53:59 -07:00
Nelson Jovel
0a27538747
chore: add a test for categorizing sped values of 1 and 0 as 'Special Education' and 'Not Special Education'
2024-05-07 19:38:26 -07:00
Nelson Jovel
65c25fc3c7
Add ell income and sped parsing rules for SIS data. Add tests for the
new inputs.
2024-05-07 17:18:59 -07:00
Nelson Jovel
2561fa28fc
feat: Split academic year into seasons if the academic year's range is
initialized with a season, i.e. "2024-25 Fall". Update scrapers for
admin data, enrollment and staffing to use the new range standard
correctly. Update the loaders for admin data, enrollment and staffing
so that they populate all seasons in a given year. So admin data for
2024-25 gets loaded into "2024-25 Fall" and "2024-25 Spring". Add tests
for the new range format. Set the default cutoff for the start of the Spring season to the last Sunday in February
2024-04-27 14:05:02 -07:00
Nelson Jovel
67ffc996a8
Revert "Split academic year into seasons if the academic year's range is"
This reverts commit a5d4cccb37.
2024-04-26 13:48:30 -07:00
Nelson Jovel
a5d4cccb37
Split academic year into seasons if the academic year's range is
initialized with a season, i.e. "2024-25 Fall". Update scrapers for
admin data, enrollment and staffing to use the new range standard
correctly. Update the loaders for admin data, enrollment and staffing
so that they populate all seasons in a given year. So admin data for
2024-25 gets loaded into "2024-25 Fall" and "2024-25 Spring". Add tests
for the new range format. Set the default cutoff for the start of the Spring season to the last Sunday in February
2024-04-26 13:31:50 -07:00
6fac759ec2
Faster admin data loader + rename School.school_hash
2024-04-22 15:43:54 -04:00
Nelson Jovel
9bfb76db5a
Match an additional date format. Supported formats are now '1/10/2022 14:21:45', '2022-1-10T14:21:45' and '2022-1-10 14:21:45'
2024-03-01 09:30:43 -08:00
Nelson Jovel
c3cb05701f
feat: Support two date formats: ISO 8601 and the standard US date format
used in Google Sheets
2024-02-27 11:57:20 -08:00
Nelson Jovel
aa7af11a4e
fix: fix failing test
2024-02-23 11:54:57 -08:00
Nelson Jovel
cc8ed48204
fix: no longer support 'form' in filename when cleaning. Only look for 'part X' and add that to the filename if it exists
2024-02-22 12:02:25 -08:00
Nelson Jovel
d907e2742e
fix: fix failing tests
2024-02-22 12:02:15 -08:00
Nelson Jovel
a0c0b1d01d
chore: reenable test spec that tests data loader for races
2023-12-20 12:40:04 -08:00
Nelson Jovel
d4109fda6f
chore: refactor Race out of survey_item_values
2023-12-20 12:27:53 -08:00
Nelson Jovel
3e4ef9cb08
chore: refactor Gender out of survey_item_values row
2023-12-20 12:27:44 -08:00
Nelson Jovel
6a24d4fa23
chore: Make sure 'hispanic' column only gets applied when using SIS race information
2023-12-18 15:26:21 -08:00
Nelson Jovel
ce43f52bd5
feat: if the filename includes the words 'form' or 'part' add that to the resulting cleaned filename
2023-12-18 15:25:36 -08:00
Nelson Jovel
a15b01a3e1
fix: instead of looking for 'asian' at the start of a word, look for it
after a word boundary. This means it still doesn't get confused with
caucasian and it's more flexible when asian appears inside other text
such as 'Caucasian and Asian and Black'
2023-12-08 14:22:43 -08:00
Nelson Jovel
8a0ba0dbea
chore: various fixes for race and gender categorization during cleaning.
Also add tests for race and gender categorization
2023-12-08 14:22:33 -08:00
Nelson Jovel
2ef24caf70
Lower threshold for the number of valid student responses from 17 to 11
2023-12-06 13:56:14 -08:00
Nelson Jovel
b983f1e144
chore: fix categorization of gender
2023-12-01 15:32:48 -08:00
Nelson Jovel
f27a590c5a
Convert gender and race text into Qualtrics codes during cleaning. Abide by 'prefer not to disclose' for self-reported race. Give priority to self-reported data but use SIS information as backup
2023-11-30 20:57:04 -08:00
Nelson Jovel
97ddb09167
chore: add test for checking duplicate headers during cleaning process
2023-11-09 14:50:51 -08:00
Nelson Jovel
6d84204f83
Add race and gender columns to cleaned csv files when those headers are
missing
2023-11-06 20:30:51 -08:00
Nelson Jovel
a3f9e46414
chore: when searching for the DESE ID, split up the pattern to be more explicit about the order in which to search the columns that might have the DESE ID we're looking for.
2023-11-06 13:13:37 -08:00
rebuilt
019b954ffa
feat: load student responses in the same pass as loading the survey responses
chore: remove student loader since loading students is now done with the survey response loader
2023-11-02 11:38:03 -07:00
rebuilt
b2fdbe5756
feat: We no longer trust the progress number that gets exported from Qualtrics. Instead, during the cleaning process, perform a manual count of the number of responses to filter out rows that don't meet the minimum threshold.
2023-10-27 15:12:58 -07:00
rebuilt
e45a4f96dd
last commit
2023-10-26 13:29:54 -07:00
rebuilt
ef44c41965
feat: add special education disaggregation
2023-10-24 13:05:57 -07:00
rebuilt
18ab51c860
chore: upgrade to rails 7.1.
upgrade rspec
fix failing tests
upgrade devise
2023-10-24 13:04:05 -07:00
rebuilt
2fd56047d4
Add disaggregation by ELL
2023-10-24 12:51:12 -07:00
rebuilt
490522eb1e
feat: support multiple columns for race and gender information
2023-10-24 10:27:39 -07:00
rebuilt
7bd7923d41
fix: ensure cleaner outputs columns for all survey items. Before the fix, if a survey item variant (ending in -1, i.e. s-tint-q1-1) did not have a matching survey item s-tint-q1, the resulting csv would not include that column
2023-10-24 10:24:57 -07:00
rebuilt
2ac30bb107
feat: Add income table to the database. Add seeder for income. Add a reference to income from survey item response. Update the loader to import income data from the survey response csv. Refactor analyze controller to extract presenter. Add corresponding specs. Add income graph to analyze page
2023-10-24 09:05:27 -07:00
rebuilt
a4332f6a05
chore: remove outdated admin data loader file. We now use Dese::Loader to load school level data
2023-07-21 12:52:18 -07:00
rebuilt
23ddaed2ce
feat: if admin data value is above 5, round down to 5
2023-07-21 12:51:18 -07:00
rebuilt
878ba08a22
fix: There was an n+1 problem where we looked up the list of schools for
every row. Now we query the list of schools just once per file
2023-06-26 11:38:33 -07:00
rebuilt
d025a83a2b
chore: remove errant comment
2023-06-12 16:06:07 -07:00
rebuilt
0f23053294
It's possible for admin data likert score values to be above 5. If that happens, we
cap the likert score at 5. This was happening already at the scraper
level but it's also now being done by the admin data loader for safety.
Also make sure to just update admin data instead of deleting and
reloading all values each load. Add tests to confirm this behavior
2023-06-03 16:47:03 -07:00
rebuilt
e058c523b6
Missing progress or duration information does not result in a row removed in the cleaning process
2023-06-02 15:18:03 -07:00
rebuilt
a71ebbc4e4
Add Overall Response Rate
2023-05-22 16:03:34 +00:00
rebuilt
dbfc9d1d3a
Add automated data cleaning. Modify SurveyItemValues class to use regex
instead of hard coded values. Produce a clean csv and a csv with all
the removed values and columns with reason for removal. Add script for
running cleaning for each project
2023-05-16 13:39:12 -07:00
rebuilt
65b8599c6e
Update logic for calculating student response rate. Remove references
to survey table. We no longer check or keep track of the survey type.
Instead we look in the database to see if a survey item has at least 10
responses. If it does, that survey item was presented to the respondent
and we count it, and all responses when calculating the response rate.
Remove response rate timestamp from caching logic because we no longer
add the response rate to the database. All response rates are calculated
on the fly
Update three_b_two scraper to use teacher only numbers
swap over to using https://profiles.doe.mass.edu/statereport/gradesubjectstaffing.aspx as the source of staffing information
2023-04-08 10:59:48 -07:00
rebuilt
8bd65d367b
make sure spec tests what it's supposed to test: that the value of the responses gets updated when new information is loaded from another csv
2023-03-29 16:24:56 -07:00
rebuilt
282a671531
Change survey data loader spec to use factorybot objects instead of loading seeds. Change databasecleaner to use transaction. Add back babel-preset dependency to fix failing javascript test in production.
2023-03-29 15:45:48 -07:00
rebuilt
825259bdd8
Merge branch 'rpp-response-rate' into rpp-main to bring in improvements
to how we get enrollment and staffing information. Also speed up tests
2023-03-22 16:52:55 -07:00
rebuilt
6b31fa9115
Batch imports for staffing data
2023-03-08 04:51:15 -08:00