Commit graph

139 commits

Author SHA1 Message Date
Nelson Jovel
bbefdcb5bb chore: remove student loader since we load race information in data loader 2023-12-20 13:47:09 -08:00
Nelson Jovel
063810a9d7 chore: make sure to load races in data loader instead of student loader 2023-12-20 13:39:03 -08:00
Nelson Jovel
381625f58b chore: reenable test spec that tests data loader for races 2023-12-20 12:40:22 -08:00
Nelson Jovel
317fe3239a chore: refactor Race out of survey_item_values 2023-12-20 12:35:14 -08:00
Nelson Jovel
a295b8afb9 chore: refactor Gender out of survey_item_values row 2023-12-20 12:33:18 -08:00
Nelson Jovel
0359dae88a chore: rename counts_by_grade to enrollment_by_grade for clarity 2023-12-18 15:59:30 -08:00
Nelson Jovel
ee5e53f992 chore: Make sure 'hispanic' column only gets applied when using SIS race information 2023-12-18 15:58:54 -08:00
Nelson Jovel
6f265302ea feat: if the filename includes the words 'form' or 'part' add that to the resulting cleaned filename 2023-12-18 15:58:40 -08:00
Nelson Jovel
fe039e3d04 chore: various fixes for race and gender categorization during cleaning.
Also add tests for race and gender categorization
2023-12-18 15:38:31 -08:00
rebuilt
a07728fcd6 feat: We no longer trust the progress number that gets exported from qualtrics. Instead during the cleaning process, perform a manual count of the number of responses to filter out rows that don't meet the minimum threshold. 2023-10-27 15:13:17 -07:00
rebuilt
9d680c5159 chore: upgrade to rails 7.1.
upgrade rspec

fix failing tests

upgrade devise
2023-10-17 12:34:11 -07:00
rebuilt
acfdaf5587 feat: add special education disaggregation 2023-10-17 12:29:00 -07:00
rebuilt
5bdffec8f9 Add disaggregation by ELL 2023-10-17 12:06:19 -07:00
rebuilt
245cde85cd Add disaggregation by ELL 2023-10-17 11:18:47 -07:00
rebuilt
6db93cb116 feat: Rename income labels to 'Economically Disadvantaged' and 'Not Economically Disadvantaged' 2023-08-09 12:13:04 -07:00
rebuilt
a1b580048b chore: remove precalculated race scores. Calculate race scores on every reload 2023-08-08 15:54:15 -07:00
rebuilt
76ebcc6ef3 feat: Add income table to the database. Add seeder for income. Add a reference to income from survey item response. Update the loader to import income data from the survey response csv. Refactor analyze controller to extract presenter. Add corresponding specs. Add income graph to analyze page 2023-08-08 15:52:12 -07:00
rebuilt
7373e2e52f fix: Filter out responses that don't correspond to the grades the school serves 2023-08-08 11:46:14 -07:00
rebuilt
1265a164b9 feat: add popover to analyze graphs that displays the n-size of the different columns. Make sure to only calculate a score for a race if there are more than 10 respondents to a question. 2023-08-08 11:43:40 -07:00
rebuilt
e42aa05324 chore: remove outdated admin data loader file. We now use Dese::Loader to load school level data 2023-07-21 12:58:01 -07:00
rebuilt
8f276a5f1a feat: if admin data value is above 5, round down to 5 2023-07-21 12:55:39 -07:00
rebuilt
22cc303a95 fix: There was an n+1 problem where we looked up the list of schools for
every row. Now we query the list of schools just once per file
2023-06-26 11:25:32 -07:00
rebuilt
784e23982e feat: update survey_item_response table to include recorded date and import recorded date when loading responses 2023-06-23 11:28:14 -07:00
rebuilt
25a2698ac9 Fix: make sure values don't get reordered after copying over row values from survey item variants. This fixes a problem where cleaner would produce a row with likert scores that got shifted to align with the wrong column 2023-06-08 09:27:44 -07:00
rebuilt
f7c40c2da2 Fix: enable correct detection of student survey types by rejecting any
headers ending with '-1' (the variants of standard questions)
2023-06-07 12:40:14 -07:00
rebuilt
2445642586 Don't write a file if there's an empty dataset 2023-06-07 12:40:03 -07:00
rebuilt
6b2bceceb6 Fix: Parse headers when they are surrounded by quotes. This helps load recent csv files correctly 2023-06-06 18:29:47 -07:00
rebuilt
ce76c979a4 Add scraper for 3B-i student/#courses ratio 2023-06-05 11:39:08 -07:00
rebuilt
904d0d2f2c It's possible for admin data likert score values to be above 5. If that happens, we
cap the likert score at 5. This was happening already at the scraper
level but it's also now being done by the admin data loader for safety.
Also make sure to just update admin data instead of deleting and
reloading all values each load. Add tests to confirm this behavior
2023-06-03 16:06:50 -07:00
rebuilt
3589878700 Don't check standard deviation for early education surveys 2023-06-02 16:10:35 -07:00
rebuilt
89295f8832 Missing progress or duration information does not result in a row removed in the cleaning process 2023-06-02 15:13:53 -07:00
rebuilt
6022739f07 use district short name when writing filename 2023-05-31 17:12:40 -07:00
rebuilt
f749b96006 update response_date to recorded_date 2023-05-31 17:07:31 -07:00
rebuilt
9d0f8659f1 Adjust valid duration threshold of short form items 2023-05-26 19:01:33 -07:00
rebuilt
37e932e078 Fix regression in student loader 2023-05-24 12:04:39 -07:00
rebuilt
0dfc9726d0 Add automated data cleaning. Modify SurveyItemValues class to use regex
instead of hard coded values.  Produce a clean csv and a csv with all
the removed values and columns with reason for removal. Add script for
running cleaning for each project
2023-05-24 11:59:53 -07:00
rebuilt
a066f464c7 fix failing tests 2023-04-30 16:35:08 -07:00
rebuilt
f1022728fa Fix problem with dese scraper lumping in 2021-22 data as 2022-23 data.
Deleted unused csvs.  Turned off puts statements in admin loader.
Remove old, now unused admin data loader class.
2023-04-30 12:04:20 -07:00
rebuilt
798ba1f340 Only return files in sftp directory, not other directories 2023-04-30 11:55:45 -07:00
rebuilt
001d3083c8 Calculate response rate on the fly instead of looking it up from the db
when calculating response rates.
2023-04-22 14:03:22 -07:00
rebuilt
cee7aa4c59 Remove unused filename 2023-04-22 14:01:21 -07:00
rebuilt
d3a28f7635 Fix ThreeATwo scraper 2023-04-22 14:01:01 -07:00
rebuilt
07ed8dd259 Update logic for calculating student response rate. Remove references
to survey table.  We no longer check or keep track of the survey type.
Instead we look in the database to see if a survey item has at least 10
responses.  If it does, that survey item was presented to the respondent
and we count it, and all responses when calculating the response rate.

Remove response rate timestamp from caching logic because we no longer
add the response rate to the database. All response rates are calculated
on the fly

Update three_b_two scraper to use teacher only numbers

swap over to using https://profiles.doe.mass.edu/statereport/gradesubjectstaffing.aspx as the source of staffing information
2023-04-22 14:00:20 -07:00
rebuilt
357c7427d1 Batch imports for staffing data 2023-04-22 13:19:34 -07:00
rebuilt
d272e48adc load total students and batch importing records 2023-04-22 13:18:50 -07:00
rebuilt
cf2b2433e9 Use an sftp uri unique to MCIEA 2023-02-19 19:36:14 -08:00
rebuilt
a5da0fb0c6 Fix bug with not all survey responses loading when using sftp loader 2023-02-19 19:36:14 -08:00
rebuilt
ef087a6cd0 update default folder for survey responses 2023-02-19 19:36:14 -08:00
rebuilt
47c1856281 Process 1000 rows at a time to limit memory usage in production 2023-02-19 19:36:14 -08:00
rebuilt
640de1c8df Don't print sftptogo_url 2023-02-19 19:36:14 -08:00