167 commits

Author SHA1 Message Date
Nelson Jovel
24dbf33bcd chore: remove unneeded comment 2023-12-11 15:55:00 -08:00
Nelson Jovel
f028e6c884 feat: if the filename includes the words 'form' or 'part' add that to the resulting cleaned filename 2023-12-11 15:39:20 -08:00
Nelson Jovel
3f44613085 chore: various fixes for race and gender categorization during cleaning.
Also add tests for race and gender categorization
2023-12-08 13:12:19 -08:00
Nelson Jovel
883731bce1 feat: Support SIS categories for race in the form of 'White and Asian'
or 'Black, Asian, and white'
2023-12-07 13:40:02 -08:00
Nelson Jovel
0931950eaf chore: make sure to output race and gender columns during cleaning 2023-12-07 13:39:25 -08:00
Nelson Jovel
3db0f9f757 fix: Print out message to make clean when there are duplicate headers
present in the raw survey file
2023-12-07 12:28:24 -08:00
Nelson Jovel
6541b87e9c feat: add 2023-24 academic year and make sure previous year enrollment
and staffing data get loaded when missing
2023-12-07 12:27:19 -08:00
Nelson Jovel
b7e670bb60 Lower threshold for the number of valid student responses from 17 to 11 2023-12-06 14:15:19 -08:00
Nelson Jovel
7dc881f654 chore: refactor code to put logic into models 2023-12-06 14:14:56 -08:00
Nelson Jovel
e325f38c43 Convert gender and race text into qualtrics codes during cleaning. Abide by 'prefer not to disclose' for self reported race. Give priority to self reported data but use SIS information as backup 2023-12-06 14:10:16 -08:00
Nelson Jovel
9efc1f41c6 chore: Add comment about StaffingLoader also cloning enrollment data when it clones staffing data 2023-12-06 14:09:54 -08:00
Nelson Jovel
f6f78bcd58 fix: make sure to grab the 'Gender- Qcode' column 2023-12-06 14:09:13 -08:00
Nelson Jovel
8bebe7db42 chore: since it's now possible for there to be multiple district and dese id columns separated by a dash and a number, be more explicit when we only want to filter out survey item ids that end in a -1 2023-11-06 14:52:21 -08:00
Nelson Jovel
b63c327d33 chore: when searching for dese id, split up the pattern to be more explicit about the order in which to search the columns that might have the dese ID we're looking for. 2023-11-06 13:15:50 -08:00
rebuilt
cddea60c8b feat: reduce number of rows to process at one time to reduce memory use 2023-11-02 12:21:32 -07:00
rebuilt
1a707eb6bc feat: load student responses in the same pass as loading the survey responses
chore: remove student loader since loading students is now done with the survey response loader
2023-11-02 09:52:39 -07:00
rebuilt
e3fbbabce5 feat: We no longer trust the progress number that gets exported from qualtrics. Instead, during the cleaning process, perform a manual count of the number of responses to filter out rows that don't meet the minimum threshold. 2023-10-27 15:12:24 -07:00
rebuilt
83661540b7 chore: upgrade to rails 7.1.
upgrade rspec

fix failing tests

upgrade devise
2023-10-11 10:58:52 -07:00
rebuilt
48e795fcfb feat: add special education disaggregation 2023-10-06 11:41:52 -07:00
rebuilt
060d7aa55a Add disaggregation by ELL 2023-09-29 19:29:23 -07:00
rebuilt
abea2cb8fa feat: support multiple columns for race and gender information 2023-08-25 15:37:20 -07:00
rebuilt
463e4c9452 fix: hide scores on analyze page for scores that don't meet the student threshold of 25% 2023-08-23 15:56:46 -07:00
rebuilt
714b90b3eb fix: ensure cleaner outputs columns for all survey items. Before the fix, if a survey item variant (ending in -1, e.g. s-tint-q1-1) did not have a matching survey item s-tint-q1, the resulting csv would not include that column 2023-08-23 15:30:47 -07:00
rebuilt
2321897283 fix: start fixing problem with variants not getting added to the cleaned csv 2023-08-23 15:30:47 -07:00
rebuilt
a785c69c44 Add Overall Response Rate 2023-08-09 15:13:58 -07:00
rebuilt
4afa030141 chore: remove precalculated race scores. Calculate race scores on every reload 2023-08-07 16:02:59 -07:00
rebuilt
f035c4d9ad fix: Filter out responses that don't correspond to the grades the school serves 2023-08-04 17:11:02 -07:00
rebuilt
5f49746bf4 feat: Rename income labels to 'Economically Disadvantaged' and 'Not Economically Disadvantaged' 2023-07-31 16:47:34 -07:00
rebuilt
67e469a66c feat: add popover to analyze graphs that displays the n-size of the different columns. Make sure to only calculate a score for a race if there are more than 10 respondents to a question. 2023-07-27 16:17:46 -07:00
rebuilt
cec48e55d3 chore: remove outdated admin data loader file. We now use Dese::Loader to load school level data 2023-07-21 12:16:59 -07:00
rebuilt
5c7729beeb feat: if admin data value is above 5, round down to 5 2023-07-21 12:14:46 -07:00
rebuilt
cbd5687ff0 feat: Add out of state admin data 2023-07-20 17:06:07 -07:00
rebuilt
4f035f6a63 feat: Add income table to the database. Add seeder for income. Add a reference to income from survey item response. Update the loader to import income data from the survey response csv. Refactor analyze controller to extract presenter. Add corresponding specs. Add income graph to analyze page 2023-07-07 09:14:36 -07:00
rebuilt
d72f8d31e0 fix: There was an n+1 problem where we looked up the list of schools for
every row. Now we query the list of schools just once per file
2023-06-26 11:36:03 -07:00
rebuilt
e8aa75bf66 feat: update survey_item_response table to include recorded date and import recorded date when loading responses 2023-06-23 11:26:53 -07:00
Nelson Jovel
0a2c5e02c5 feat: add ability to merge disaggregation data with raw survey data to
produce a cleaned csv with merged income disaggregation columns
2023-06-20 12:22:24 -07:00
rebuilt
e2d24a9bec Fix: make sure values don't get reordered after copying over row values from survey item variants. This fixes a problem where cleaner would produce a row with likert scores that got shifted to align with the wrong column 2023-06-08 09:26:56 -07:00
rebuilt
ddf9a628d5 Fix: enable correct detection of student survey types by rejecting any
headers ending with '-1' (the variants of standard questions)
2023-06-07 12:38:58 -07:00
rebuilt
896f0d9961 Don't write a file if there's an empty dataset 2023-06-07 12:38:44 -07:00
rebuilt
76b79b99c2 Fix: Parse headers when they are surrounded by quotes. This helps load recent csv files correctly 2023-06-06 18:18:52 -07:00
rebuilt
30285efd69 It's possible for admin data likert score values to be above 5. If that happens, we
cap the likert score at 5. This was happening already at the scraper
level but it's also now being done by the admin data loader for safety.
Also make sure to just update admin data instead of deleting and
reloading all values each load. Add tests to confirm this behavior
2023-06-03 17:14:41 -07:00
rebuilt
abe7a8804c Don't check standard deviation for early education surveys 2023-06-02 16:05:45 -07:00
rebuilt
9aeb5f92af Missing progress or duration information does not result in a row removed in the cleaning process 2023-06-02 15:23:21 -07:00
rebuilt
e3ae12b425 update response_date to recorded_date 2023-05-31 16:57:47 -07:00
rebuilt
a30921ce06 Add New Jersey enrollment and staffing data 2023-05-28 17:11:52 -07:00
rebuilt
93d087a5de Use short district name for cleaned csv 2023-05-28 17:11:27 -07:00
rebuilt
8ef8cfce58 Adjust valid duration threshold of short form items 2023-05-26 18:30:44 -07:00
rebuilt
d6b2521883 Fix regression in student loader 2023-05-19 13:48:16 -07:00
rebuilt
4509c157fa Add automated data cleaning. Modify SurveyItemValues class to use regex
instead of hard-coded values. Produce a clean csv and a csv with all
the removed values and columns with reason for removal. Add script for
running cleaning for each project
2023-05-16 13:38:29 -07:00
rebuilt
359e266a6c Remove unused TODOs 2023-04-27 15:47:45 -07:00