Hi there! It’s been a while.
Today I want to present my cheap-ish solution for convenient mass document scanning to paperless-ngx or another DMS.
My use-case
I wanted to archive lots of documents that I possess into paperless-ngx. But using the ADF in the multi-function printer I already have was not an option. For one, it is slow, and it also can’t scan both sides automatically. So I looked into a few options. Proper ADF document scanners are expensive. And if you want one with built-in scan-to-SMB-share capabilities, they double in price. And as these things are incredibly useful, they stay that expensive even on the used market.
I came across some eBay listings for used Fujitsu Fi 6130 scanners. Mine came from Poland for about 65€, which is a very good price. At first I was a bit suspicious because the next cheapest option cost twice as much, but this thing was in immaculate condition. Maybe you could also ask some friends who work in IT, because afterwards I learned that I could have gotten one for free: a lot of communities around me are ditching their USB document scanners for newer networked versions, so they have a loooot of them in storage waiting to be scrapped.
I still think the 65€ were a good investment since mine doesn’t have many pages on the counter and works absolutely flawlessly. Who knows what I would have gotten if I had gone for the free option from a town office where they scanned thousands upon thousands of paper documents each year.
Paperless-NGX is a nice piece of software. It expects documents in a consume folder, which I mapped to a networked share. There are some minor technicalities to why I’ve done things in my following “cookbook” the way they are. Here are my requirements on how I want to scan my documents:
- I want as little manual interaction as possible
- I don’t want to scan one document at a time
- I don’t want blank pages amongst the documents in paperless
From what I found it is best to not use the PDF function on scanimage because it produces a PDF with just a bunch of white pages. I did not get it to work in any reliable way, so here is what I did in a nutshell:
- I create a new subfolder
- I scan all pages in a loop to separate tiff files
- I convert them to PDFs using ImageMagick’s convert tool, reducing file size with JPEG compression, and merge them into a single PDF with qpdf.
This manual is to scan documents at the push of a button. Happy scanning :)
The How-To:
Preparation
Step 1: Install updates and packages
sudo apt update
sudo apt upgrade
sudo apt install sane cifs-utils scanbd imagemagick qpdf
Step 2: Create the ramdisk folder, then add the last line below to your crontab (replace the placeholder with your scan user)
mkdir /var/scans
crontab -e
@reboot mount -t tmpfs -o size=400M /var/scans;chown <scanuser>:scanner /var/scans;chmod 775 /var/scans;
Step 3: Add a corresponding line to /etc/fstab
ramdisk /var/scans tmpfs defaults,size=400M,x-gvfs-show 0 0
Step 4: Disable the swapfile so your tmpfs truly resides in RAM and not in a swapfile, like so:
sudo dphys-swapfile swapoff
Step 5: Reboot
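After the reboot it is worth checking that /var/scans really is a tmpfs mount. Here is a minimal sketch, assuming a Linux system that provides /proc/mounts. The function name mount_fstype is my own invention for this example and not part of any package.

```shell
#!/bin/bash
# Print the filesystem type of a mount point by reading a mounts table.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
# mount_fstype is a hypothetical helper name.
mount_fstype() {
    local mountpoint="$1" table="${2:-/proc/mounts}"
    # Columns in /proc/mounts: device, mount point, fs type, options, ...
    # If a path is mounted more than once, the last entry is the active one.
    awk -v mp="$mountpoint" \
        '$2 == mp { fstype = $3 } END { if (fstype != "") print fstype }' "$table"
}

# Prints the filesystem type if /var/scans is mounted, nothing otherwise.
mount_fstype /var/scans
```

If this prints tmpfs, the scans will be written to RAM as intended.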
Config
Step 1: In /etc/sane.d/dll.conf, comment out all the driver names except net and, in this example, fujitsu. If you have another scanner, check the SANE website to see which driver suits your model.
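Commenting out the drivers by hand is tedious on a stock dll.conf. Here is a sketch of how it could be automated, assuming the file lists one driver name per line. The helper name keep_only_drivers is made up for this example. It reads from stdin and writes to stdout, so nothing is changed until you redirect the output yourself.

```shell
#!/bin/bash
# Comment out every active driver line except the ones named as arguments.
# Reads a dll.conf from stdin and writes the filtered version to stdout.
# keep_only_drivers is a hypothetical helper name.
keep_only_drivers() {
    local keep=" $* "
    local line trimmed name
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip leading whitespace, then take the first word as the driver name.
        trimmed="${line#"${line%%[![:space:]]*}"}"
        name="${trimmed%%[[:space:]]*}"
        case "$name" in
            ''|'#'*)
                # Blank lines and existing comments pass through unchanged.
                printf '%s\n' "$line"
                ;;
            *)
                if [[ "$keep" == *" $name "* ]]; then
                    printf '%s\n' "$line"
                else
                    printf '#%s\n' "$line"
                fi
                ;;
        esac
    done
}

# Small demonstration on sample input instead of the real file.
printf 'net\nfujitsu\nepson2\n' | keep_only_drivers net fujitsu
```

Against the real file you would run something like `keep_only_drivers net fujitsu < /etc/sane.d/dll.conf > /tmp/dll.conf.new` and review the result before copying it back.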
Step 2: Plug in your scanner and run lsusb to check if your scanner was recognized by the system
Step 3: Create a udev rule in /etc/udev/rules.d/:
sudo vi /etc/udev/rules.d/50-scanner.rules
SUBSYSTEM=="usb", ATTRS{idVendor}=="04c5", ATTRS{idProduct}=="114f", GROUP="scanner"
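The vendor and product IDs in the rule come from the "ID vvvv:pppp" part of the lsusb output. Here is a small sketch that builds the rule from such a line. The helper name udev_rule_from_lsusb is hypothetical, and the description text in the sample line is only illustrative.

```shell
#!/bin/bash
# Build a udev rule from one line of lsusb output.
# $1 = lsusb line, $2 = group that should own the device (default: scanner).
# udev_rule_from_lsusb is a hypothetical helper name.
udev_rule_from_lsusb() {
    local line="$1" group="${2:-scanner}"
    local ids vendor product
    # lsusb prints "... ID <vendor>:<product> <description>".
    ids=$(printf '%s\n' "$line" |
        sed -n -E 's/.* ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4}).*/\1 \2/p')
    # Give up if the line does not contain a vendor:product pair.
    [ -n "$ids" ] || return 1
    vendor="${ids% *}"
    product="${ids#* }"
    printf 'SUBSYSTEM=="usb", ATTRS{idVendor}=="%s", ATTRS{idProduct}=="%s", GROUP="%s"\n' \
        "$vendor" "$product" "$group"
}

# The description at the end of this sample line is illustrative only.
udev_rule_from_lsusb "Bus 001 Device 004: ID 04c5:114f Example Scanner"
```

For a different scanner, pass in the matching line from your own lsusb output.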
Step 4: Add your scanner user to the scanner group:
sudo usermod -a -G scanner <scanuser>
Step 5: Start sane automatically and run as your scan user:
Edit: /etc/default/saned
Set RUN=yes and RUN_AS_USER=<scanuser>
Step 6: Edit the unit file of scanbd like this:
sudo systemctl edit --full scanbd
[Unit]
Description=Scanner button polling Service
[Service]
Type=simple
# GROUP: add the next line with your scanner group (systemd does not allow comments at the end of a line)
Group=scanner
ExecStart=/usr/sbin/scanbd -f -c /etc/scanbd/scanbd.conf
#ExecReload=?
Environment=SANE_CONFIG_DIR=/etc/scanbd
StandardInput=null
StandardOutput=syslog
StandardError=syslog
#NotifyAccess=?
[Install]
WantedBy=multi-user.target
#Also=scanbm.socket #COMMENT this or the scanner will timeout
Alias=dbus-de.kmux.scanbd.server.service
Step 7: Trigger udev rules:
sudo udevadm control --reload
sudo udevadm trigger
Step 8: Check your scanner:
sane-find-scanner
scanimage -L
Step 9: Edit the ImageMagick policy in /etc/ImageMagick-X/policy.xml
Comment out the line that says pattern="PDF"
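Since policy.xml is an XML file, commenting out means wrapping the line in `<!-- -->`. Here is a sketch of doing that with sed, assuming the line looks like the one shipped in Debian-based ImageMagick packages (a self-closing policy element with pattern="PDF"). The helper name comment_out_pdf_policy is made up. It works on stdin and stdout only.

```shell
#!/bin/bash
# Wrap the PDF coder policy line in an XML comment.
# Reads policy.xml from stdin and writes the result to stdout.
# Lines that are already commented do not match and pass through unchanged.
# comment_out_pdf_policy is a hypothetical helper name.
comment_out_pdf_policy() {
    sed -E 's|^([[:space:]]*)(<policy [^>]*pattern="PDF"[^>]*/>)|\1<!-- \2 -->|'
}

# Demonstration on a sample line instead of the real file.
printf '  <policy domain="coder" rights="none" pattern="PDF" />\n' | comment_out_pdf_policy
```

Write the output to a temporary file first and compare it with the original before replacing anything under /etc.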
Step 10: Edit /etc/scanbd/scanbd.conf in the section that corresponds to your scanner’s action button. Mine is “email”, so I edit the script line in the section “action email” to say:
script = "/opt/scan-to-share"
To find out what your action button is called, you can run
scanimage -A
This will print the names of the action buttons in the “Sensors:” section.
If you don’t get any output, make sure the service scanbd isn’t running.
Step 11: This is my scan-script that I placed in /opt/scan-to-share
I can’t say if this suits your needs, but mine basically makes sure that every page scanned in duplex is merged with all the others, so it can be consumed by paperless-ngx as a single document and then separated again using PATCH-T separator pages. This ensures that regardless of how many documents and/or pages I put in the scanner, my DMS always gets a single PDF file and can process it however it sees fit.
Make sure your script is executable by the scanner group and your scan user.
Things worthy of note: This script does it in an incredibly complicated way. Normally I’d expect scanimage to work with the format PDF but for some unknown reason it did not for me. Also doing it with all these loops prevents the Raspberry Pi from memory exhaustion so the script does not crap out on me. So far it proved very reliable.
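#!/bin/bash
# Scan everything in the ADF (duplex) and copy one merged PDF to the share.
FILENAME_PREFIX="scan_"
FOLDERNAME_PREFIX="scan_"
# Create a fresh working folder on the ramdisk.
FOLDERCOUNT=$(ls -d /var/scans/* 2>/dev/null | wc -l)
FOLDER="/var/scans/${FOLDERNAME_PREFIX}${FOLDERCOUNT}"
mkdir "$FOLDER"
chmod 775 "$FOLDER"
cd "$FOLDER" || exit 1
# Scan each side of each sheet into its own TIFF (scan_1.tiff, scan_2.tiff, ...).
scanimage -d "net:localhost:fujitsu:fi-6130dj:136899" --source "ADF Duplex" --format tiff --resolution 200 --brightness 25 --contrast 25 --mode gray --batch="${FILENAME_PREFIX}%d.tiff"
chmod -R 775 "$FOLDER"
TIFFCOUNT=$(ls "${FILENAME_PREFIX}"*.tiff 2>/dev/null | wc -l)
# Stop here if the feeder was empty, so no empty PDF lands on the share.
if [ "$TIFFCOUNT" -eq 0 ]; then
    cd / && rm -R "$FOLDER"
    exit 0
fi
# Convert page by page to keep memory usage low on the Raspberry Pi.
i=1
while [ "$i" -le "$TIFFCOUNT" ]
do
    convert "${FILENAME_PREFIX}${i}.tiff" -quality 60 -compress jpeg "${FILENAME_PREFIX}${i}.pdf"
    i=$((i+1))
done
# Build the qpdf page list: page 1 of every single-page PDF, in scan order.
y=1
QPDFPAGES=()
while [ "$y" -le "$TIFFCOUNT" ]
do
    QPDFPAGES+=("${FILENAME_PREFIX}${y}.pdf" 1)
    y=$((y+1))
done
qpdf --empty --pages "${QPDFPAGES[@]}" -- scan.pdf
# Copy the result to the share, numbered by the current file count, then clean up.
FILECOUNT=$(ls /mnt/scan/scan_* 2>/dev/null | wc -l)
cp scan.pdf "/mnt/scan/scan_${FILECOUNT}.pdf"
cd / && rm -R "$FOLDER"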
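One thing to be aware of: the target name is derived from the number of files currently on the share. Because paperless-ngx removes files from the consume folder once they are processed, a number can come up again while an older file with that name is still waiting. Here is a sketch of an alternative naming scheme, assuming a timestamp in the file name is acceptable. The helper name unique_target is made up for this example.

```shell
#!/bin/bash
# Print a file name in the given directory that does not exist yet.
# $1 = target directory, $2 = name prefix (default: scan_).
# unique_target is a hypothetical helper name.
unique_target() {
    local dir="$1" prefix="${2:-scan_}"
    local stamp candidate n=0
    stamp=$(date +%Y%m%d_%H%M%S)
    candidate="${dir}/${prefix}${stamp}.pdf"
    # Two scans finishing within the same second get a numeric suffix.
    while [ -e "$candidate" ]; do
        n=$((n + 1))
        candidate="${dir}/${prefix}${stamp}_${n}.pdf"
    done
    printf '%s\n' "$candidate"
}

# In the scan script this could stand in for the FILECOUNT lines:
# cp scan.pdf "$(unique_target /mnt/scan)"
```

The counter-based naming in the script works fine as long as the share is emptied between scans, so treat this as optional.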
Step 12: Create your scan target share’s mount folder
sudo mkdir /mnt/scan
Step 13: Edit /etc/fstab to mount your smb fileshare by adding a line like this.
Take care! The uid matches my saned user, which runs scanbd. You can find your uid in /etc/passwd. The gid matches my scanner group, which you can find in /etc/group.
//servername/scanshare /mnt/scan cifs user,uid=105,gid=111,vers=3.11,credentials=/root/FreenasCredentials,rw,auto 0 0
Step 14: Create a file with your share’s credentials in /root and secure it with appropriate permissions. The content should look like this:
username=<share user name>
password=<password of your share’s user>
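"Appropriate permissions" here means readable by the owner only. Here is a sketch of creating the file that way. The helper name write_credentials is hypothetical, and the values in the usage comment are placeholders.

```shell
#!/bin/bash
# Write a CIFS credentials file that only its owner can read.
# $1 = file path, $2 = share user, $3 = password.
# write_credentials is a hypothetical helper name.
write_credentials() {
    local file="$1" user="$2" pass="$3"
    (
        # Restrictive umask so a new file is never world-readable, even briefly.
        umask 077
        printf 'username=%s\npassword=%s\n' "$user" "$pass" > "$file"
    )
    # Also tighten the mode in case the file already existed.
    chmod 600 "$file"
}

# Example (run as root; the values are placeholders):
# write_credentials /root/FreenasCredentials '<share user name>' '<password>'
```

Note that typing a password on the command line leaves it in your shell history, so editing the file by hand and running chmod 600 on it afterwards is just as good.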
Step 15: Also edit /etc/scanbd/dll.conf and comment out all the drivers that are not for your scanner to make it faster.
Step 16: Next disable the scanbm socket like so:
sudo systemctl stop scanbm.socket
sudo systemctl disable scanbm.socket
Step 17: Restart your system and begin your scanning

Thanks for this guide. It’s a little unclear what machine you’re running this on. A Raspberry Pi connected via USB?
Hi HarvsG,
I’m using a Raspberry Pi connected via USB to the scanner.
This is also where everything is executed.
Hey there, thank you for this tutorial. I just purchased a scanner from Poland for 65 Euro too. Maybe it’s the same shop! :)
How did you manage to remove blank pages from your scans? You listed it in your requirements, but the tutorial has no configuration for it.
Does it work out of the box with these printer drivers?
Hi Tim,
I’m glad you liked it :)
I think so, I think it is the only store on ebay that sells it for that price currently.
Oh the removal of the blank pages is done in paperless-ngx using a pre-consume script.
This forum post details it quite well:
https://github.com/paperless-ngx/paperless-ngx/discussions/668
I pretty much copy-pasted it.
You have to play around with the threshold value a bit. 0.002 was way too low for me; I use 0.9.
But you have to figure out for yourself what works best for your kind of documents.
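To get a feeling for what the threshold does, here is a sketch of the comparison only. It assumes the pre-consume script measures one coverage number per page and keeps pages whose value is above the threshold. The helper name is_blank_page is made up, and the measuring itself is not shown here.

```shell
#!/bin/bash
# Succeed when a page's measured coverage is at or below the threshold.
# $1 = coverage value for one page, $2 = threshold.
# is_blank_page is a hypothetical helper; how the coverage number is
# measured is left to the pre-consume script.
is_blank_page() {
    awk -v c="$1" -v t="$2" 'BEGIN { exit !(c + 0 <= t + 0) }'
}

# The same page value can be kept with one threshold and dropped with another.
if is_blank_page 0.5 0.002; then echo "0.002: dropped"; else echo "0.002: kept"; fi
if is_blank_page 0.5 0.9; then echo "0.9: dropped"; else echo "0.9: kept"; fi
```

So a higher threshold drops more pages, which is why scans with some noise or show-through on the back side may need a larger value.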