How to Automate SFTP File Transfers with Python: A Complete Guide
Automating file transfers is one of the most common tasks in data engineering, DevOps, and system integration workflows. Whether you're syncing data between systems, backing up files to remote servers, or building ETL pipelines, SFTP provides a secure and reliable way to move files programmatically.
Python makes SFTP automation straightforward thanks to the Paramiko library, which implements the SSH protocol and provides a clean API for file operations. In this guide, we'll walk through everything you need to know to build production-ready SFTP automation scripts.
We'll cover authentication, uploading and downloading files, error handling, logging, and best practices for deploying automated file transfer workflows.
Why Python for SFTP automation?
Python has become the go-to language for file transfer automation for several good reasons:
- Rich Ecosystem: Libraries like Paramiko, pysftp, and fabric make SSH and SFTP operations simple and Pythonic.
- Easy Integration: Python integrates seamlessly with cloud services, databases, data processing frameworks, and orchestration tools like Airflow and Prefect.
- Cross-Platform: Python scripts run on Linux, Windows, and macOS without modification, making deployment flexible.
- Readable Syntax: Python's clean syntax makes scripts easy to understand, maintain, and hand off to other team members.
Setting up your environment
Before writing code, you'll need to install the necessary Python packages. The most widely used library for SFTP in Python is Paramiko.
Installing Paramiko:
```
pip install paramiko
```

For production environments, it's best to specify versions in a requirements.txt file:
```
paramiko==3.4.0
cryptography==42.0.0
```

Basic SFTP connection
Let's start with a simple script that connects to an SFTP server using SSH key authentication:
```python
import os
import paramiko

# Connection details
hostname = 'sftp.example.com'
port = 22
username = 'your_username'
private_key_path = os.path.expanduser('~/.ssh/id_rsa')

# Create SSH client
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
sftp = None

try:
    # Load private key
    private_key = paramiko.RSAKey.from_private_key_file(private_key_path)

    # Connect to server
    ssh.connect(hostname, port=port, username=username, pkey=private_key)

    # Open SFTP session
    sftp = ssh.open_sftp()
    print("Successfully connected to SFTP server!")

    # Your file operations go here

except Exception as e:
    print(f"Error: {e}")
finally:
    # Close connections even if an operation failed
    if sftp is not None:
        sftp.close()
    ssh.close()
```
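A quick note on host keys: AutoAddPolicy accepts whatever key the server presents, which is convenient for a first test but leaves you open to man-in-the-middle attacks. A safer pattern for production, sketched below, is to trust only hosts already recorded in your known_hosts file:

```python
import paramiko

ssh = paramiko.SSHClient()
ssh.load_system_host_keys()  # read keys from ~/.ssh/known_hosts
# Refuse connections to hosts whose keys we haven't seen before
ssh.set_missing_host_key_policy(paramiko.RejectPolicy())
```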
Uploading files

Once connected, uploading files is straightforward. Here's a function that uploads a single file with error handling:
```python
def upload_file(sftp, local_path, remote_path):
    """Upload a file to SFTP server"""
    try:
        sftp.put(local_path, remote_path)
        print(f"Uploaded: {local_path} -> {remote_path}")
        return True
    except FileNotFoundError:
        print(f"Error: Local file not found: {local_path}")
        return False
    except PermissionError:
        print(f"Error: Permission denied on remote path: {remote_path}")
        return False
    except Exception as e:
        print(f"Error uploading file: {e}")
        return False

# Example usage
upload_file(sftp, '/local/data/report.csv', '/remote/uploads/report.csv')
```
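Worth knowing: sftp.put() already verifies transfers by default, since its confirm parameter (True unless you disable it) stats the remote file afterwards and checks the size. If you want that verification to be explicit in your own code, a minimal sketch:

```python
import os

def upload_and_verify(sftp, local_path, remote_path):
    """Upload a file, then compare local and remote sizes explicitly."""
    sftp.put(local_path, remote_path)
    local_size = os.path.getsize(local_path)
    remote_size = sftp.stat(remote_path).st_size
    if local_size != remote_size:
        raise IOError(f"Size mismatch after upload: {local_size} != {remote_size}")
```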
Downloading files

Downloading works similarly, but in reverse:
```python
import os

def download_file(sftp, remote_path, local_path):
    """Download a file from SFTP server"""
    try:
        # Create local directory if it doesn't exist
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        sftp.get(remote_path, local_path)
        print(f"Downloaded: {remote_path} -> {local_path}")
        return True
    except FileNotFoundError:
        print(f"Error: Remote file not found: {remote_path}")
        return False
    except Exception as e:
        print(f"Error downloading file: {e}")
        return False

# Example usage
download_file(sftp, '/remote/data/export.csv', '/local/downloads/export.csv')
```
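One pattern worth borrowing here is the atomic download: write to a temporary name and rename only once the transfer finishes, so downstream processes never pick up a half-written file. A minimal sketch (the .part suffix is just a convention):

```python
import os

def download_atomic(sftp, remote_path, local_path):
    """Download to a temp file, then rename so readers never see partial data."""
    tmp_path = local_path + '.part'
    sftp.get(remote_path, tmp_path)
    os.replace(tmp_path, local_path)  # atomic when both paths share a filesystem
```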
Working with directories

Real-world automation often involves processing multiple files. Here's how to list, filter, and batch-process files:
```python
import fnmatch

def list_files(sftp, remote_dir, pattern='*'):
    """List files in a remote directory"""
    try:
        files = sftp.listdir(remote_dir)
        matching_files = [f for f in files if fnmatch.fnmatch(f, pattern)]
        return matching_files
    except Exception as e:
        print(f"Error listing files: {e}")
        return []

def download_directory(sftp, remote_dir, local_dir, pattern='*.csv'):
    """Download all matching files from a directory"""
    files = list_files(sftp, remote_dir, pattern)
    downloaded = []
    failed = []
    for filename in files:
        remote_path = f"{remote_dir}/{filename}"
        local_path = f"{local_dir}/{filename}"
        if download_file(sftp, remote_path, local_path):
            downloaded.append(filename)
        else:
            failed.append(filename)
    print(f"Downloaded {len(downloaded)} files, {len(failed)} failed")
    return downloaded, failed

# Example: Download all CSV files from a directory
download_directory(sftp, '/remote/exports', '/local/data', pattern='*.csv')
```
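download_directory handles a single flat directory. When the remote side nests files in subdirectories, you can walk the tree with listdir_attr, which returns stat-style attributes you can test with the standard stat module. A minimal recursive sketch, reusing the download_file helper above:

```python
import stat

def download_tree(sftp, remote_dir, local_dir):
    """Recursively download a remote directory tree."""
    for entry in sftp.listdir_attr(remote_dir):
        remote_path = f"{remote_dir}/{entry.filename}"
        local_path = f"{local_dir}/{entry.filename}"
        if stat.S_ISDIR(entry.st_mode):
            # Recurse into subdirectories
            download_tree(sftp, remote_path, local_path)
        else:
            # download_file() creates the local directory as needed
            download_file(sftp, remote_path, local_path)
```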
Common SFTP Operations

For quick reference, these are the Paramiko SFTPClient methods you'll reach for most often:

| Operation | Method |
| --- | --- |
| Upload a file | sftp.put(local_path, remote_path) |
| Download a file | sftp.get(remote_path, local_path) |
| List a directory | sftp.listdir(path) |
| Delete a file | sftp.remove(path) |
| Rename or move | sftp.rename(old_path, new_path) |
| Create a directory | sftp.mkdir(path) |
| Get file attributes | sftp.stat(path) |

Production-ready SFTP client class
For real-world use, it's better to wrap SFTP operations in a reusable class with proper logging and error handling:
```python
import logging
import os
import paramiko
from pathlib import Path

class SFTPClient:
    def __init__(self, hostname, username, port=22, key_path=None, password=None):
        self.hostname = hostname
        self.port = port
        self.username = username
        self.key_path = key_path
        self.password = password
        self.ssh = None
        self.sftp = None
        self.logger = logging.getLogger(__name__)

    def connect(self):
        """Establish SFTP connection"""
        try:
            self.ssh = paramiko.SSHClient()
            self.ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            if self.key_path:
                # expanduser() so paths like '~/.ssh/id_rsa' work
                key_file = os.path.expanduser(self.key_path)
                private_key = paramiko.RSAKey.from_private_key_file(key_file)
                self.ssh.connect(self.hostname, port=self.port,
                                 username=self.username, pkey=private_key)
            else:
                self.ssh.connect(self.hostname, port=self.port,
                                 username=self.username, password=self.password)
            self.sftp = self.ssh.open_sftp()
            self.logger.info(f"Connected to {self.hostname}")
            return True
        except Exception as e:
            self.logger.error(f"Connection failed: {e}")
            return False

    def disconnect(self):
        """Close SFTP connection"""
        if self.sftp:
            self.sftp.close()
        if self.ssh:
            self.ssh.close()
        self.logger.info("Disconnected from SFTP server")

    def upload(self, local_path, remote_path):
        """Upload a file"""
        try:
            self.sftp.put(str(local_path), str(remote_path))
            self.logger.info(f"Uploaded {local_path} to {remote_path}")
            return True
        except Exception as e:
            self.logger.error(f"Upload failed: {e}")
            return False

    def download(self, remote_path, local_path):
        """Download a file"""
        try:
            Path(local_path).parent.mkdir(parents=True, exist_ok=True)
            self.sftp.get(str(remote_path), str(local_path))
            self.logger.info(f"Downloaded {remote_path} to {local_path}")
            return True
        except Exception as e:
            self.logger.error(f"Download failed: {e}")
            return False

    def __enter__(self):
        # Fail fast: raise instead of returning a half-initialized client
        if not self.connect():
            raise ConnectionError(f"Could not connect to {self.hostname}")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.disconnect()

# Usage with context manager
with SFTPClient('sftp.example.com', 'username',
                key_path='~/.ssh/id_rsa') as client:
    client.upload('/local/file.csv', '/remote/file.csv')
    client.download('/remote/data.json', '/local/data.json')
```

Security best practices
When automating SFTP operations in production, security is critical. Follow these best practices:
| Do This | Avoid This |
| --- | --- |
| Authenticate with SSH keys | Hardcoding passwords in scripts |
| Verify server host keys against a known_hosts file | Auto-accepting unknown host keys in production |
| Load credentials from environment variables or a secrets manager | Committing credentials or .env files to version control |
| Give the SFTP user only the permissions it needs | Running transfers under a highly privileged account |
Using environment variables for credentials
Never hardcode credentials in your scripts. Use environment variables or a secrets management system:
```python
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Get credentials from environment
hostname = os.getenv('SFTP_HOST')
username = os.getenv('SFTP_USER')
key_path = os.getenv('SFTP_KEY_PATH')

# Use with your SFTP client
with SFTPClient(hostname, username, key_path=key_path) as client:
    client.upload('data.csv', '/uploads/data.csv')
```

Create a .env file (and add it to .gitignore):
```
SFTP_HOST=sftp.example.com
SFTP_USER=your_username
SFTP_KEY_PATH=/home/user/.ssh/id_rsa
```

Scheduling automated transfers
Once your script is ready, you need to schedule it to run automatically. Here are the most common approaches:
Scheduling Options

- Cron (Linux/macOS): the simplest option for standalone scripts
- Windows Task Scheduler: the equivalent on Windows hosts
- Orchestration tools like Airflow or Prefect: when the transfer is one step in a larger pipeline
- A long-running Python process with an in-process scheduler (see the schedule-library sketch after the cron example)
Example Cron Job (runs daily at 2 AM):
```
0 2 * * * /usr/bin/python3 /home/user/scripts/sftp_sync.py >> /var/log/sftp_sync.log 2>&1
```
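If you'd rather keep scheduling inside Python itself, the third-party schedule library is a readable alternative to cron. A minimal sketch, where run_sync is a hypothetical stand-in for your transfer job:

```python
import time
import schedule  # pip install schedule

def run_sync():
    """Stand-in for your transfer job (e.g. the backup script below)."""
    print("Running SFTP sync...")

# Mirror the cron example: run every day at 2 AM
schedule.every().day.at("02:00").do(run_sync)

while True:
    schedule.run_pending()
    time.sleep(60)  # wake once a minute to check for due jobs
```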
Error handling and retry logic

Network issues, server downtime, and file locks can cause transfers to fail. Implement retry logic to make your automation resilient:
```python
import time
from functools import wraps

def retry(max_attempts=3, delay=5):
    """Decorator to retry a function on failure"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            attempts = 0
            while attempts < max_attempts:
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    attempts += 1
                    if attempts >= max_attempts:
                        raise  # out of attempts; propagate the error
                    print(f"Attempt {attempts} failed: {e}")
                    print(f"Retrying in {delay} seconds...")
                    time.sleep(delay)
        return wrapper
    return decorator

# Use the decorator on your upload function. Note that SFTPClient.upload()
# returns False on failure rather than raising, so convert a False result
# into an exception the decorator can actually catch:
@retry(max_attempts=3, delay=10)
def upload_with_retry(client, local_path, remote_path):
    if not client.upload(local_path, remote_path):
        raise IOError(f"Upload of {local_path} failed")
    return True
```
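A refinement worth considering is exponential backoff: doubling the wait after each failure so a struggling server isn't hit at a fixed rhythm. A sketch of the same decorator with backoff added:

```python
import time
from functools import wraps

def retry_with_backoff(max_attempts=3, base_delay=2):
    """Retry decorator that doubles the delay after each failed attempt."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_attempts:
                        raise  # give up; let the caller handle it
                    wait = base_delay * 2 ** (attempt - 1)  # 2s, 4s, 8s, ...
                    print(f"Attempt {attempt} failed: {e}; retrying in {wait}s")
                    time.sleep(wait)
        return wrapper
    return decorator
```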
Logging and monitoring

Proper logging is essential for debugging and monitoring automated transfers:
```python
import logging
from datetime import datetime

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(f'sftp_transfer_{datetime.now():%Y%m%d}.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)

# Use throughout your script
logger.info("Starting SFTP sync job")
logger.debug(f"Connecting to {hostname}")
logger.error(f"Failed to upload {filename}: {error}")
logger.info("SFTP sync job completed successfully")
```
Complete example: Automated daily backup

Here's a complete example that ties everything together: a script that backs up local files to a remote SFTP server daily:
```python
#!/usr/bin/env python3
"""
Daily SFTP backup script
Uploads files from local directory to remote SFTP server
"""
import os
import logging
from datetime import datetime
from pathlib import Path

from sftp_client import SFTPClient  # Your SFTPClient class

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(f'backup_{datetime.now():%Y%m%d}.log'),
        logging.StreamHandler()
    ]
)
logger = logging.getLogger(__name__)

def backup_files():
    """Main backup function"""
    # Configuration
    local_dir = Path('/data/exports')
    remote_dir = '/backups'
    hostname = os.getenv('SFTP_HOST')
    username = os.getenv('SFTP_USER')
    key_path = os.getenv('SFTP_KEY_PATH')

    logger.info("Starting backup job")

    # Find files to backup (modified within the last 24 hours)
    files_to_backup = [
        f for f in local_dir.glob('*.csv')
        if (datetime.now() - datetime.fromtimestamp(f.stat().st_mtime)).days == 0
    ]
    logger.info(f"Found {len(files_to_backup)} files to backup")

    # Connect and upload
    with SFTPClient(hostname, username, key_path=key_path) as client:
        success_count = 0
        failed_count = 0
        for local_file in files_to_backup:
            remote_path = f"{remote_dir}/{local_file.name}"
            if client.upload(local_file, remote_path):
                success_count += 1
            else:
                failed_count += 1

    logger.info(f"Backup complete: {success_count} succeeded, {failed_count} failed")
    return success_count, failed_count

if __name__ == '__main__':
    try:
        backup_files()
    except Exception as e:
        logger.error(f"Backup job failed: {e}")
        raise
```

Performance optimization tips
For large-scale file transfers, consider these optimizations:
- Parallel transfers: Use Python's multiprocessing or threading to upload/download multiple files simultaneously (see the thread-pool sketch after this list)
- Compression: Enable compression for text-based files to reduce transfer time
- Resume capability: Implement checkpointing to resume interrupted transfers
- Batch operations: Group small files and transfer them as archives when possible
- Connection pooling: Reuse SSH connections when transferring many files
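Here's a sketch of parallel uploads with a thread pool. Paramiko's SFTP session isn't designed to be shared across threads, so each worker opens its own connection; this reuses the SFTPClient class from earlier and assumes hostname, username, and key_path are defined as in the environment-variable example:

```python
from concurrent.futures import ThreadPoolExecutor

def upload_one(paths):
    """Upload one file over a dedicated connection (one per thread)."""
    local_path, remote_path = paths
    with SFTPClient(hostname, username, key_path=key_path) as client:
        return client.upload(local_path, remote_path)

jobs = [
    ('/local/a.csv', '/remote/a.csv'),
    ('/local/b.csv', '/remote/b.csv'),
]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(upload_one, jobs))

print(f"{sum(results)} of {len(jobs)} uploads succeeded")
```

For the compression tip, note that SSHClient.connect() accepts compress=True, which enables SSH-level compression for the whole session.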
Common pitfalls to avoid
Watch out for these common mistakes:
- Not closing connections: Always use context managers or try/finally blocks to ensure connections are closed
- Path inconsistencies: Be careful with Windows vs. Linux path separators. Use pathlib's Path objects instead of string concatenation
- No file verification: Check file sizes or checksums after transfer to verify integrity
- Insufficient error handling: Network failures are common; always implement retry logic
- Ignoring permissions: Ensure your SFTP user has proper read/write permissions on remote directories
Conclusion
Python and Paramiko make SFTP automation straightforward and reliable. By following the patterns and best practices in this guide, you can build production-grade file transfer scripts that handle errors gracefully, log properly, and integrate seamlessly into your data pipelines.
Start with a simple connection and file transfer, then gradually add error handling, logging, retry logic, and scheduling as your needs grow. Remember to prioritize security by using SSH keys, protecting credentials, and validating host keys.
Whether you're automating daily backups, building ETL workflows, or integrating with partner systems, Python SFTP automation provides the flexibility and control you need without the complexity of enterprise tools.