I need to set up a process for easily sorting worksheets completed by students. The basic idea: I’ll collect their work and run it through a feed scanner, which sends me a PDF of all of the pages. Using ImageMagick, I’ll split the PDF, turning the pages into JPEGs. I can already do all of the steps above. The next part is where I need help:

First, the script needs to read something I place on EVERY page: a barcode (or similar marking) at the top of each JPEG that indicates the page number. E.g. if it sees page 1a, it sends that JPEG to a directory called "1a". The marking could be a barcode, or maybe there is software that can recognize a printed page number directly?
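To make this step concrete, here is a rough sketch of the routing I have in mind. It assumes the page marking is a barcode or QR code that encodes the label as text (e.g. "1a"), and that the third-party Python packages Pillow and pyzbar are installed for decoding. The function names `read_page_label` and `sort_page` and the "unreadable" fallback directory are made up for illustration. I have not tried this on real scans.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def read_page_label(jpeg_path: Path) -> Optional[str]:
    """Return the text of the first barcode/QR code found on the page, or None.

    Assumption: Pillow and pyzbar are installed; they are imported lazily so
    the sorting logic below can be used (and tested) without them.
    """
    from PIL import Image
    from pyzbar.pyzbar import decode

    results = decode(Image.open(jpeg_path))
    return results[0].data.decode("utf-8") if results else None


def sort_page(
    jpeg_path: Path,
    out_root: Path,
    reader: Callable[[Path], Optional[str]] = read_page_label,
) -> Path:
    """Move one scanned page into out_root/<label>/ and return its new path.

    Pages whose marking cannot be read go to out_root/unreadable/ so they
    can be sorted by hand.
    """
    label = reader(jpeg_path) or "unreadable"
    dest_dir = out_root / label
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / jpeg_path.name
    shutil.move(str(jpeg_path), str(dest))
    return dest
```

Something like `sort_page(Path("page-003.jpg"), Path("sorted"))` would then be called once per JPEG produced by ImageMagick. Passing the reader in as a parameter is only there so a different decoder could be swapped in later.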

Next, it needs to read something the students put on the page that identifies who filled in the worksheet, and sort the pages accordingly. For this, the students will indicate their name or perhaps an assigned number. Since I doubt any software can reliably recognize handwriting across many different students, this might be some kind of Scantron-like area where students fill in bubbles. E.g. I assign a student #130, they fill in the bubbles for 130, and the script then sends that JPEG to directory "1a/130/".
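For the bubble idea, this is roughly the logic I picture, written in plain Python so it does not depend on any particular imaging library. It assumes the page has already been loaded as a grid of grayscale values (0 = black, 255 = white, e.g. via an imaging library such as Pillow), that the bubble positions are known in advance because I print them in fixed places, and that each digit of the student number has its own column of ten bubbles. The names, the darkness cutoff of 128, and the 50% fill threshold are all guesses for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]
Pixels = Sequence[Sequence[int]]  # rows of grayscale values, 0 = black


def dark_fraction(pixels: Pixels, box: Box, dark_below: int = 128) -> float:
    """Fraction of pixels inside the box that are darker than the cutoff."""
    left, top, right, bottom = box
    total = (right - left) * (bottom - top)
    dark = sum(
        1
        for y in range(top, bottom)
        for x in range(left, right)
        if pixels[y][x] < dark_below
    )
    return dark / total


def read_digit(
    pixels: Pixels, bubbles: Sequence[Box], min_fill: float = 0.5
) -> Optional[int]:
    """bubbles[d] is the box for digit d. Return the single filled digit,
    or None if no bubble or more than one bubble looks filled."""
    filled = [
        digit
        for digit, box in enumerate(bubbles)
        if dark_fraction(pixels, box) >= min_fill
    ]
    return filled[0] if len(filled) == 1 else None


def read_student_number(
    pixels: Pixels, columns: Sequence[Sequence[Box]]
) -> Optional[str]:
    """Read one digit per column; return e.g. "130", or None if any column
    is blank or ambiguous so the page can be checked by hand."""
    digits: List[Optional[int]] = [read_digit(pixels, col) for col in columns]
    if any(d is None for d in digits):
        return None
    return "".join(str(d) for d in digits)
```

The returned string would be used as the subdirectory name under the page directory, so a page labelled 1a with bubbles for 130 ends up in "1a/130/". Real scans are probably skewed or shifted a little, so the fixed bubble positions would likely need some alignment step first, which this sketch ignores.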

Does anyone have ideas about what tools I can use to read this information off the page?

