DA Film/TV RAID Automation System
About
This system was created to reduce the extensive staff and faculty time spent creating and managing users and courses for the Film/TV department at De Anza.
Academic Term Lifecycle
To ensure smooth transitions between quarters, there are a few steps that need to be performed each academic term. This section should serve as a refresher to remind the system administrator of all steps needed to successfully create and remove an academic term. For developer documentation of the system and its components, please see the "Concepts and Components" section below.
1. Update Argos Report
Data from Banner is sent to the Middleware tier via Argos. To update this report, submit a ticket to ETS and request that the report be updated to use the next term. The report can be found at ETS -> Student_PROD_Scheduled -> Quarter Start -> Enrolled Class Roster -> DA F/TV Class Roster. Note that this action will only update the data placed on the Middleware tier; it will have no effect on the RAID unit until the build_term variable is changed in config.txt (see below).
2. Update config.txt
Note: Before updating this file, it's a good idea to verify that the cron job responsible for processing adds has been disabled. See "Disable the cron entry to stop processing adds" below.
Do NOT use a Mac program such as TextEdit to update the config.txt file, as this can introduce unwanted ^M (carriage return) characters at the end of each line, which can break the system's input parser.
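If you suspect a file has picked up ^M characters, a check along the following lines may help. This is a minimal sketch and not part of the kmm_ftvautomation scripts: the function names count_cr_lines and strip_cr are hypothetical, and the file to check is passed as an argument.

```shell
# Print the number of lines in file $1 that contain a carriage return
# (the character vi displays as ^M).
count_cr_lines() {
    cr=$(printf '\r')
    # grep -c prints the count of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -c "$cr" "$1" || true
}

# Write a copy of file $1 to $2 with all carriage returns removed.
# $2 must be a different file from $1, or the input would be truncated.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```

A clean file should report a count of 0. If the count is higher, strip_cr can write a cleaned copy for you to review before replacing the original.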
Settings to control the system are contained in the file /root/kmm_ftvautomation/config.txt on the RAID unit. Log into that system and update config.txt (e.g., using vi) to reference the new quarter. Quarters are identified by a 6-digit term code in the format described below. Note that the Academic Years used in scheduling do not follow the Calendar Year exactly; they use a "trailing year" numbering sequence that begins in summer. So the summer and fall quarters which occur in calendar year 2018 fall in Academic Year 2019 and are labeled "Summer 2019" and "Fall 2019" in Banner.
Quarter Format: The 6-digit term code takes the form YYYYQC, where:
- YYYY is the 4-digit academic year (e.g., 2018).
- Q is the numeric representation of the quarter (1 for Summer, 2 for Fall, 3 for Winter, and 4 for Spring).
- C is the campus (1 for Foothill, 2 for De Anza).
Examples:
- 201842 = Spring 2018 at De Anza.
- 201912 = Summer 2019 at De Anza (note that this takes place during calendar year 2018).
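To double-check a term code before putting it in config.txt, the format above can be decoded with a small helper. This is only a sketch: decode_term is a hypothetical function, not one of the system's scripts, and the calendar-year rule (Summer and Fall fall one year before the academic year) is taken from the description above.

```shell
# Decode a YYYYQC term code into a human-readable description.
decode_term() {
    term=$1
    # Require exactly six digits.
    case $term in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "invalid term code: $term" >&2; return 1 ;;
    esac
    year=${term%??}     # first four digits: academic year
    rest=${term#????}   # last two digits: quarter and campus
    q=${rest%?}
    c=${rest#?}
    case $q in
        1) quarter=Summer ;;
        2) quarter=Fall ;;
        3) quarter=Winter ;;
        4) quarter=Spring ;;
        *) echo "invalid quarter digit: $q" >&2; return 1 ;;
    esac
    case $c in
        1) campus="Foothill" ;;
        2) campus="De Anza" ;;
        *) echo "invalid campus digit: $c" >&2; return 1 ;;
    esac
    # Summer and Fall take place in the calendar year before the academic year.
    case $q in
        1|2) cal=$((year - 1)) ;;
        *)   cal=$year ;;
    esac
    echo "$quarter $year, $campus (calendar year $cal)"
}
```

For example, decode_term 201912 prints "Summer 2019, De Anza (calendar year 2018)".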
3. Wait for the nightly Middleware update
After both the Argos report and the config.txt file have been updated, you will need to wait 1 day for the Middleware system to process the new data and update the files on the RAID system.
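If you want to confirm that fresh files have arrived before moving on, you could list what changed in the last day. This is a rough sketch: recently_updated is a hypothetical helper, and it assumes the Middleware places its files directly in the directory you pass in (this document only implies that the enrollment and roster data lives in kmm_ftvautomation).

```shell
# List regular files directly inside directory $1 that were modified
# less than 24 hours ago.
recently_updated() {
    find "$1" -maxdepth 1 -type f -mtime -1
}
```

An empty result would suggest the nightly update has not run yet.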
4. Run create_term.pl
Once the Middleware server has updated the files on the RAID unit, you can run the create_term.pl script. Log in to the RAID unit as root via SSH, then run the script as follows:
ftv-san:~# cd kmm_ftvautomation
ftv-san:~/kmm_ftvautomation# ./create_term.pl -t 201912
After several seconds, you should see a prompt to restart samba. Press enter to do so. (Note that this will kick any users off the system for 2 minutes or so while the samba4 service is restarted.)
5. When ready, email users
The command above will not email users during the account creation process. This is to ensure that students have an opportunity to meet with their professor and learn about the system before they are provided access. Once the faculty are ready, you may email all students with the following command. Note that you will only want to do this ONCE per quarter, as the system does not keep track of which users have already been notified (see the next step for information about handling adds).
ftv-san:~# cd kmm_ftvautomation
ftv-san:~/kmm_ftvautomation# ./create_term.pl -e
NOTE: You may want to change the text of the information sent to students at some point. You can do so by updating the file located at /root/kmm_ftvautomation/TEMPLATE_create_student.txt.
6. Enable cron to process adds
Since students may add the class after the course has started, it's important to ensure these accounts are created and that students are notified of their new account. Since processing adds results in a restart of the samba service, it's important that this task happen late at night or early in the morning. This task is accomplished via the cron system on the RAID unit.
To enable the cron entry, verify that the correct term is listed in the config.txt file (see above), then log in to the web interface of the SmallTree RAID and navigate to: System -> Advanced. Click the tab at the top labeled "Cron." Find the command with the Description "Process adds with Kevin's script" (if the script is disabled, it should appear in red text) and click the wrench icon at the far right of that row. NOTE: There are two scripts which look similar. One will automatically send emails when it creates accounts and the other will not; only one should be enabled at any time.
At the top right corner of your screen, you should see a toggle switch labeled "Enable." Click that toggle and press the "Save" button at the bottom. (NOTE: Don't press "Run now" or the system will restart and kick all users off.) You may see a note at the top of the page to the effect that you need to Apply Changes for them to take effect. You can safely click the Apply Changes button without affecting users.
Finally, it's a good idea to set a reminder on your calendar to disable this command during the third week of classes.
Running cron without emailing students: If you wish to enable the nightly add process, but don't want the system to email users (for example, if it's the week before classes start and you haven't yet sent the initial email blast from "When ready, email users" above), you can enable the last entry in the GUI cron list, which is labeled "Process adds with Kevin's script - WILL NOT EMAIL!" Note that you will want to disable this cron job before enabling the one which will send email.
7. Disable the cron entry to stop processing adds
Once you reach the third week of classes, you may safely disable the cron entry, as faculty are not permitted to add students after the Quarterly Census. To disable the cron entry, follow the "Enable" instructions above, but this time toggle the switch to "Disable" instead. After you click the "Save" button at the bottom, you should notice the command on that row has switched from black text to red text.
8. Email faculty before removing the term
When the term is nearing its end, you will want to give faculty time to move any files they wish to keep out of the quarter folders.
9. After the term is over, delete all data with remove_term.pl
When the term is over and all faculty grades have been submitted, you can run the remove_term.pl script to remove old student accounts and clear the term-specific data out of the faculty folders. To prevent catastrophic data loss, this script does NOT use the contents of config.txt to determine which quarter it will delete. You will instead need to provide the term on the command line. By default, this script asks you to confirm before it deletes any files. Since each user has several files, this can take a LOT of time. Instead, you will almost certainly want to override this behavior by providing the -y option:
ftv-san:~# cd kmm_ftvautomation
ftv-san:~/kmm_ftvautomation# ./remove_term.pl -t 201922 -y
After several minutes (this process can take a LONG time), you should see a prompt to restart samba. Press enter to do so. (Note that this will kick any users off the system for 2 minutes or so while the samba4 service is restarted.)
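Because -y skips the per-file confirmation, you may want one deliberate check that the term on the command line is the one you mean to delete. The following is a sketch of one possible guard: confirm_term is a hypothetical helper, not a feature of remove_term.pl.

```shell
# Ask the operator to re-type the term code; succeed only on an exact match.
confirm_term() {
    expected=$1
    printf 'Re-type the term to remove (%s): ' "$expected" >&2
    IFS= read -r typed || return 1
    [ "$typed" = "$expected" ]
}
```

It could be chained in front of the removal, for example: confirm_term 201922 && ./remove_term.pl -t 201922 -y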
10. Cleaning up the kmm_ftvautomation directory
From time to time, you may wish to remove the enrollment and roster data from the kmm_ftvautomation directory. To do so, you can run the following command, where '123456' is the term for which you wish to remove data:
ftv-san:~# cd kmm_ftvautomation
ftv-san:~/kmm_ftvautomation# rm -i *123456*
This command will ask you to confirm each file before it is removed. This helps ensure you don't accidentally remove something you didn't intend to remove.
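If you would rather preview the matches before answering the rm -i prompts, something like the following can list them without deleting anything. This is a minimal sketch: list_term_files is a hypothetical helper, and it only looks at regular files directly inside the given directory.

```shell
# Print, one per line, the regular files in directory $1 whose names
# contain the term code $2. Nothing is deleted.
list_term_files() {
    dir=$1
    term=$2
    find "$dir" -maxdepth 1 -type f -name "*${term}*" | sort
}
```

For example: list_term_files /root/kmm_ftvautomation 123456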
Concepts and Components
Data flow diagram
6. Custom Code
The code for this system is written in Perl. The source-of-truth repository is on GitHub at https://github.com/fhda-ets/daftvraid . The production deployment resides on the server distance.deanza.fhda.edu in /opt/git/daftvraid/ . If any changes are made there, BE SURE TO PUSH THE CHANGES TO GITHUB! When the code is executed with the correct parameters, it will download new templates and settings from the RAID unit and use that data to process all data in the Banner export files. A cron job runs on distance.deanza.edu under the kmetcalf account. Should you wish to change this account in the future, the cron job is:
[kmetcalf@distance ~]$ crontab -l
# EXECUTE THE DA FILM/TV RAID AUTOMATION SCRIPT EACH DAY AT 3:00 AM:
00 03 * * * cd /opt/git/daftvraid && /opt/git/daftvraid/daftvraid.pl -d 2>&1 >> /var/log/daftvraid
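When reading the log produced by the entry above, it helps to know that the order of shell redirections matters. With "2>&1 >> file", stderr is pointed at the original stdout before stdout is sent to the file, so only stdout lands in the log. Under cron, output that is not redirected is typically mailed to the crontab owner. The sketch below demonstrates the difference; emit_both, log_stdout_only, and log_both are hypothetical stand-ins and are not part of daftvraid.

```shell
# Stand-in for a script that writes one line to stdout and one to stderr.
emit_both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Same order as the crontab entry: stderr is duplicated onto the current
# stdout first, and only then is stdout appended to the log file $1.
log_stdout_only() {
    emit_both 2>&1 >> "$1"
}

# Reversed order: stdout is appended to the log file $1 first, then
# stderr is duplicated onto it, so both streams reach the log.
log_both() {
    emit_both >> "$1" 2>&1
}
```

If the intent is to capture errors in /var/log/daftvraid as well (an assumption; the original intent is not documented here), the second form is the one that does so.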
Troubleshooting and Lessons Learned
This area tracks lessons learned from fixing things that have broken before.
Double Cron jobs
During Winter of 2019 the De Anza campus had a power outage which resulted in the SmallTree RAID being powered off and then back on again without going through the required shutdown process. This resulted in the server coming online in an unexpected state: both the "send email" and "don't send email" scripts ran at the start of the following term. This meant the system entered a race condition in which two processes were simultaneously creating user accounts and updating the config.xml file which stores users. As a result, the file was overwritten and the system was left in an unknown state in which user accounts were created in the BSD system, but this information was not registered in config.xml, preventing the users from actually logging in via smb.
The repair for this involved the following steps:
- Back up all data in the /mnt/tank0/Students folder.
- Delete the corrupt term.
- Re-create the term.
- Restore student data from the initial backup.
- Fix all permissions on the restored files.
To minimize the impact to instruction, this was done over a weekend. Note that the commands below are bookended by 'date' commands to track how long each task required.
Step 1: Backup all data in the /mnt/tank0/Students folder:
ftv-san:/mnt/tank0# mkdir backup_2019_04_20_Students
ftv-san:/mnt/tank0# date && rsync -av Students/* backup_2019_04_20_Students/ && date
This process took about an hour to process 686 GB of student data.
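Since the next step deletes the term, it may be worth a quick sanity check that the backup is complete before continuing. The following is a rough sketch, not something that was run during the original repair: count_files and same_file_count are hypothetical helpers, and a matching file count does not prove the contents are identical.

```shell
# Print the number of regular files under directory $1.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Succeed only if directories $1 and $2 hold the same number of files.
same_file_count() {
    [ "$(count_files "$1")" = "$(count_files "$2")" ]
}
```

For example: same_file_count Students backup_2019_04_20_Students && echo "counts match". Note that the Students/* pattern in the rsync command skips hidden entries at the top level of Students, which could account for a small difference.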
Step 2: Deleting the corrupt term:
ftv-san:~/kmm_ftvautomation# date && ./remove_term.pl -t 201942 -y && date
As noted above, deleting data off the RAID requires a surprisingly long time. Removing this term took around 1.5 hours. Be aware that this step will result in SUBSTANTIAL numbers of warning messages from the remove_term.pl script. This is to be expected, as the system will be unable to find the text it's replacing in the config.xml file, which means the system will encounter many uninitialized values.
Step 3: Recreate the term:
ftv-san:~/kmm_ftvautomation# date && ./create_term.pl -t 201942 && date
This took about 1.5 hours.
Step 4: Restoring student data from initial backup:
ftv-san:~/kmm_ftvautomation# bash
[root@ftv-san ~/kmm_ftvautomation]# cd /mnt/tank0/Students/
[root@ftv-san /mnt/tank0/Students]# date && for d in * ; do rsync -rulv --exclude '.*' ../backup_2019_04_20_Students/$d/ $d/; done && date
This took 35 minutes.
Step 5: Fix permissions on restored files:
[root@ftv-san /mnt/tank0/Students]# for d in * ; do find $d/ -user root -type f | grep -Ev 'shared$|submissions$' | xargs -I {} -L 1 echo chown $d \'{}\' >> ~/kmm_ftvautomation/juststudentperms.sh ; done
[root@ftv-san /mnt/tank0/Students]# bash ~/kmm_ftvautomation/juststudentperms.sh
This took a couple of minutes, but didn't completely work as expected. Several file names contained quotes or non-printable characters, which broke the generated chown commands. To get a list of everything you'll need to process manually, run the following:
[root@ftv-san /mnt/tank0/Students]# find . -user root -type f | grep -Ev 'shared$|submissions$' | cut -d '/' -f 2,3,4 | sort -u
For each result returned, take advantage of the -R (recursive) option of chown, for example:
[root@ftv-san /mnt/tank0/Students]# chown -R 1111111 "1111111/some guy's folder/"
This manual process took 30-45 minutes.
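Should this repair ever be needed again, the quoting problem might be avoided by letting find pass file names directly to chown instead of generating a script. The following is an untested-in-production sketch under stated assumptions: reown_student_files and the CHOWN override are hypothetical, and it assumes (as in the commands above) that each student's directory name is also their user name.

```shell
# Command used to change ownership; set CHOWN=echo for a dry run.
CHOWN=${CHOWN:-chown}

# For each student directory under $1, give files owned by user $2
# (root in the incident above) to the user named after the directory.
# Files whose names end in "shared" or "submissions" are skipped,
# mirroring the grep filter used above.
reown_student_files() {
    base=$1
    from_user=$2
    for path in "$base"/*/; do
        [ -d "$path" ] || continue
        student=$(basename "$path")
        # -exec ... {} + hands names to the command as separate arguments,
        # so quotes and other odd characters need no escaping.
        find "$path" -user "$from_user" -type f \
            ! -name '*shared' ! -name '*submissions' \
            -exec "$CHOWN" "$student" {} +
    done
}
```

A dry run such as CHOWN=echo followed by reown_student_files /mnt/tank0/Students root prints the commands' arguments without changing anything, which is a reasonable first step before running it for real.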