Core Concepts
SFTPSync is a platform that automates file-based workflows by receiving, validating, transforming, and routing structured data files. This section explains the core concepts of the platform and how they work together.
1. Platform Overview
SFTPSync is designed to solve the challenge of managing file transfers and data integrations across multiple clients, systems, and file formats. It provides a centralized platform where you can:
- Create dedicated SFTP spaces for each client or data source
- Define schemas that validate incoming data files
- Build automated pipelines that transform files into standardized formats
- Send processed data to destination systems via webhooks or API
- Monitor file processing and alert on errors
2. Clients
In SFTPSync, a Client represents an organization or business entity that interacts with your system through file transfers. Each client has:
- Dedicated SFTP space: A secure environment where the client can upload files
- SFTP credentials: Username and password for connecting to the SFTP server
- SFTP host: The server address displayed on the client's page for connection
- Client ID: A unique identifier used in API calls and data routing
- Associated pipelines: Workflows that process files uploaded by this client
Clients are isolated from each other, ensuring data privacy and security. Clients can connect to their dedicated SFTP space using their assigned username and password with the host specified on their configuration page. You can manage multiple clients from a centralized dashboard, monitoring their file transfers and pipeline executions.
// Example client object
{
  "clientId": "c4b3f495-5dfc-4a91-b604-a8e66ab4a220",
  "clientName": "acme_logistics",
  "displayName": "Acme Logistics Inc.",
  "status": "active",
  "created": "2023-05-15T10:30:00Z",
  "lastActive": "2023-06-01T08:45:22Z",
  "sftpUsername": "acme_sftp"
}
3. Schemas
A Schema defines the structure and validation rules for data files. It specifies:
- Fields: The columns or properties expected in the data
- Data types: The expected type for each field (string, number, date, etc.)
- Validation rules: Requirements for each field (required/optional, format, range, etc.)
Schemas help ensure data quality by rejecting files that don't meet your specifications. They define what data is expected and its format, but do not handle transformations (which are handled separately by pipelines).
// Example schema definition
{
  "schemaId": "ord-schema-v1",
  "name": "Order Schema v1",
  "fileType": "csv",
  "fields": [
    {
      "name": "order_id",
      "type": "string",
      "required": true,
      "validation": {
        "pattern": "^ORD-[0-9]{6}$"
      }
    },
    {
      "name": "customer_email",
      "type": "string",
      "required": true,
      "validation": {
        "format": "email"
      }
    },
    {
      "name": "order_date",
      "type": "date",
      "required": true,
      "sourceFormat": "MM/DD/YYYY",
      "targetFormat": "YYYY-MM-DD"
    },
    {
      "name": "total_amount",
      "type": "number",
      "required": true,
      "validation": {
        "min": 0
      }
    }
  ]
}
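To make the validation rules concrete, here is a minimal sketch of how a row could be checked against a schema shaped like the example above. The `validateRow` helper and its error-list return shape are illustrative assumptions, not the actual SFTPSync engine; only the `pattern`, `format: "email"`, and `min` rules from the example are covered.

```javascript
// Illustrative row validator for a schema shaped like "Order Schema v1".
// validateRow and its return shape are assumptions for illustration only.
const orderSchemaFields = [
  { name: 'order_id', type: 'string', required: true, validation: { pattern: '^ORD-[0-9]{6}$' } },
  { name: 'customer_email', type: 'string', required: true, validation: { format: 'email' } },
  { name: 'total_amount', type: 'number', required: true, validation: { min: 0 } },
];

// Returns an array of human-readable error strings (empty when the row is valid)
function validateRow(row, fields) {
  const errors = [];
  for (const field of fields) {
    const value = row[field.name];
    if (value === undefined || value === '') {
      if (field.required) errors.push(`${field.name}: required`);
      continue;
    }
    const rules = field.validation || {};
    if (rules.pattern && !new RegExp(rules.pattern).test(value)) {
      errors.push(`${field.name}: does not match ${rules.pattern}`);
    }
    if (rules.format === 'email' && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)) {
      errors.push(`${field.name}: invalid email`);
    }
    if (field.type === 'number') {
      const n = Number(value);
      if (Number.isNaN(n)) {
        errors.push(`${field.name}: not a number`);
      } else if (rules.min !== undefined && n < rules.min) {
        errors.push(`${field.name}: below minimum ${rules.min}`);
      }
    }
  }
  return errors;
}
```

A row such as `{ order_id: 'ORD-123456', customer_email: 'a@b.com', total_amount: '10.50' }` passes all three rules, while a malformed row collects one error per failed rule.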
4. Pipelines
A Pipeline in SFTPSync defines how files are processed when uploaded to a specific folder. Each pipeline includes:
- Schema: The target structure that uploaded files are mapped to (defines the expected data format)
- Webhook: Optional notification when new files are uploaded
- Starter file: Template that defines the expected header structure for Excel/CSV files
- Mappings: Column mappings (both automatic and manual) from source to target schema
- Transformations: Data manipulations that can be applied to specific columns
Each pipeline creates a dedicated folder in the client's SFTP space. When a file is uploaded to this folder, it automatically triggers the processing based on the defined mappings and transformations.
// Example pipeline configuration
{
  "options": {
    "delimiter": ",",
    "skipHeaderRow": true
  },
  "fieldMappings": [
    {
      "source": "customer_id",
      "target": "id"
    },
    {
      "source": "customer_name",
      "target": "name"
    },
    {
      "source": "customer_email",
      "target": "email",
      "transform": "toLowerCase"
    },
    {
      "source": "customer_phone",
      "target": "phone",
      "transform": "formatPhoneNumber"
    }
  ],
  "transformations": {
    "toLowerCase": "function(value) { return value.toLowerCase(); }",
    "formatPhoneNumber": "function(value) { return value.replace(/[^0-9]/g, ''); }"
  }
}
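To illustrate how mappings and transformations combine, here is a sketch that applies a configuration like the one above to a single parsed CSV row. The `applyPipeline` helper is hypothetical (SFTPSync runs this step server-side), and the transformations are shown as plain functions rather than the stringified form used in the stored configuration.

```javascript
// Hypothetical sketch of applying fieldMappings and transformations to one row.
// SFTPSync performs this server-side; this only illustrates the mechanics.
const pipeline = {
  fieldMappings: [
    { source: 'customer_id', target: 'id' },
    { source: 'customer_email', target: 'email', transform: 'toLowerCase' },
    { source: 'customer_phone', target: 'phone', transform: 'formatPhoneNumber' },
  ],
  transformations: {
    toLowerCase: (value) => value.toLowerCase(),
    formatPhoneNumber: (value) => value.replace(/[^0-9]/g, ''),
  },
};

// Maps each source column onto its target field, applying any named transform
function applyPipeline(row, { fieldMappings, transformations }) {
  const out = {};
  for (const mapping of fieldMappings) {
    let value = row[mapping.source];
    if (value === undefined) continue; // source column missing: skip the field
    if (mapping.transform && transformations[mapping.transform]) {
      value = transformations[mapping.transform](value);
    }
    out[mapping.target] = value;
  }
  return out;
}
```

For example, a source row `{ customer_id: '42', customer_email: 'A@B.COM', customer_phone: '(555) 123-4567' }` would come out as `{ id: '42', email: 'a@b.com', phone: '5551234567' }`.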
5. Typical Workflow
Here's a typical workflow in SFTPSync that illustrates how clients, schemas, and pipelines work together:
- Client Setup: You create a new client in SFTPSync, which generates SFTP credentials (username and password) and a dedicated SFTP space.
- Schema Definition: You define a schema that specifies the target data structure you want to receive after processing (the fields and their data types).
- Pipeline Creation: You create a pipeline and associate it with the schema. This automatically creates a dedicated folder in the client's SFTP space.
- Starter File Setup: You upload a template file with the expected header structure for the files your client will upload.
- Field Mapping: You define mappings between source columns in the uploaded files and target fields in your schema.
- Transformation Setup: You create JavaScript transformation functions to apply to specific fields (e.g., formatting phone numbers, converting to lowercase).
- Webhook Configuration: You set up an optional webhook URL to receive notifications when files are processed.
- File Upload: The client uploads a file to their dedicated pipeline folder in the SFTP space.
- Automatic Processing: SFTPSync detects the new file and applies the defined mappings and transformations to convert the data to the target schema format.
- Notification: If a webhook is configured, a notification is sent to the specified URL with information about the processed file.
Example Use Case: Logistics Company Order Processing
A logistics company receives order files from multiple retail clients, each using different file formats and structures. Using SFTPSync, they:
- Create a separate client account for each retailer
- Define schemas for each retailer's specific file format
- Configure pipelines that transform these diverse formats into a standardized JSON structure
- Set up webhooks that send the standardized data to their order management system
As a result, regardless of which retailer sends an order file or what format they use, the logistics company's system receives data in a consistent, standardized format. This eliminates manual processing, reduces errors, and speeds up order fulfillment.
Key Benefits
- Elimination of manual file processing (75% time savings)
- Standardization of data across multiple sources
- Automatic validation preventing bad data from entering systems
- Centralized monitoring of all file transfers
- No developer involvement needed for new client onboarding
SFTPSync Integration Documentation
This guide explains how to integrate with SFTPSync through webhooks and the REST API.
Part 1: Webhooks
When files are processed in SFTPSync, the system can notify your application through webhooks. This part explains how to receive and verify those notifications.
1. Webhook Events
SFTPSync webhooks send HTTP POST requests to your specified endpoint when certain events occur:
- GENERAL: Sent for all file processing events, including successful processing and error situations
2. Webhook Request Structure
Below is an example of the JSON payload sent to your webhook endpoint.
{
  "event": "GENERAL",
  "timestamp": "2025-05-19T21:30:00.000Z",
  "data": {
    "fileId": "550e8400-e29b-41d4-a716-446655440000",
    "filename": "example_data.csv",
    "clientName": "acme_inc",
    "status": "completed",
    "processedFilename": "example_data_processed.csv",
    "jsonFilename": "example_data.json",
    "size": 24680,
    "processedAt": "2025-05-19T21:29:55.000Z"
  }
}
3. Webhook Security
All webhook requests include a signature to verify authenticity:
- Requests contain an x-sftpsync-signature header
- The signature is an HMAC-SHA256 hash of the request body using your webhook secret
- The secret can be found in the webhook configuration section of your dashboard by clicking "View Configuration" and then "Reveal Secret"
4. Example (Node.js)
Here's a basic Node.js example using Express to listen for SFTPSync webhooks and verify their signatures.
const express = require('express');
const crypto = require('crypto');
const app = express();

// Configure middleware to parse JSON
app.use(express.json());

// Replace with your webhook secret from the SFTPSync dashboard
const WEBHOOK_SECRET = 'your_webhook_secret';

// Verify webhook signature
function verifySignature(requestPayload, signature) {
  const hmac = crypto.createHmac('sha256', WEBHOOK_SECRET);
  const digest = hmac.update(JSON.stringify(requestPayload)).digest('hex');
  const digestBuffer = Buffer.from(digest);
  const signatureBuffer = Buffer.from(signature);
  // timingSafeEqual throws if the buffers differ in length, so check first
  if (digestBuffer.length !== signatureBuffer.length) {
    return false;
  }
  return crypto.timingSafeEqual(digestBuffer, signatureBuffer);
}

// Webhook endpoint
app.post('/webhooks/sftpsync', (req, res) => {
  const signature = req.headers['x-sftpsync-signature'];
  const requestBody = req.body;

  if (!signature || !verifySignature(requestBody, signature)) {
    return res.status(401).send('Invalid signature');
  }

  const event = requestBody.event;

  switch (event) {
    case 'GENERAL': {
      // Use requestBody.data to access the payload
      const { filename, status } = requestBody.data;
      console.log(`File ${filename} event received with status: ${status}`);

      // Handle the event based on data properties
      if (status === 'completed') {
        // Handle successful processing
        console.log(`File ${filename} processed successfully`);
      } else if (status === 'failed') {
        // Handle processing failure
        const errorMessage = requestBody.data.errorMessage || 'Unknown error';
        console.log(`File ${filename} processing failed: ${errorMessage}`);
      }
      break;
    }
    default:
      console.log(`Unknown event type: ${event}`);
  }

  res.status(200).send('Webhook received');
});

app.listen(3000, () => {
  console.log('Webhook listener running on port 3000');
});
5. Best Practices
- Respond quickly (within 5 seconds) to avoid webhook timeouts
- Implement idempotency to handle potential duplicate webhook deliveries
- Use a queue system for processing webhook data asynchronously
- Store your webhook secret securely
- Implement proper error handling
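A minimal sketch of the quick-response and idempotency recommendations above, assuming deliveries are deduplicated by `fileId`; the in-memory `Set` and array queue are illustrative stand-ins for durable storage and a real queue system.

```javascript
// Illustrative dedupe-and-queue step for a webhook handler: acknowledge
// immediately, record each fileId once, and defer real work to a queue.
// In production, back these with durable storage (e.g. a database and a
// message queue) rather than process memory.
const seenFileIds = new Set();
const workQueue = [];

// Returns true if the payload was queued, false for duplicates or
// payloads without a fileId (both are acknowledged but not reprocessed).
function enqueueWebhook(payload) {
  const fileId = payload && payload.data && payload.data.fileId;
  if (!fileId || seenFileIds.has(fileId)) {
    return false;
  }
  seenFileIds.add(fileId);
  workQueue.push(payload);
  return true;
}
```

Called from the webhook endpoint, this lets the handler return 200 within the timeout window while a separate worker drains `workQueue`.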
Part 2: REST API
SFTPSync provides a REST API endpoint to fetch processed JSON data from your files.
1. Retrieving Files by Pipeline Run ID
To retrieve the data for a specific pipeline run, use the /files/pipeline-runs/:id endpoint. This is useful when you need to fetch data based on a specific processing job rather than searching by client and file names.
You can find pipeline run IDs in two ways:
- In the app: Navigate to the Pipeline Runs tab in your dashboard to view and copy pipeline run IDs for recent processing jobs
- Via webhook: Pipeline run IDs are included in webhook event payloads (see webhook section above)
Request Example:
curl -X GET "https://api.sftpsync.io/files/pipeline-runs/run_12345?offset=0&limit=10" -H "X-API-Key: your_api_key_here"
Path parameters:
- id (required): The pipeline run ID
Query parameters:
- offset (optional): Pagination offset (starting index)
- limit (optional): Number of items to return per page
Response Example:
{
  "data": [
    { "id": "123", "name": "Product Alpha", "email": "contact@example.com" },
    { "id": "124", "name": "Product Beta", "email": "support@example.com" }
  ],
  "pagination": { "offset": 0, "limit": 10, "total": 2 }
}
All API requests require authentication using your API key, which you can find in the dashboard under My Account → Security Settings → API Key (click "Reveal").
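As a sketch of consuming this endpoint with pagination, the helper below pages until `pagination.total` is reached, following the response shape shown above. The `fetchImpl` parameter is an illustrative assumption (it defaults to the global `fetch` available in Node 18+) so the paging logic can be exercised without calling the live API.

```javascript
// Sketch of paging through /files/pipeline-runs/:id. fetchImpl is injectable
// for testing; by default it uses the global fetch (Node 18+).
async function fetchAllRows(runId, apiKey, fetchImpl = fetch, limit = 100) {
  const rows = [];
  let offset = 0;
  for (;;) {
    const url = `https://api.sftpsync.io/files/pipeline-runs/${runId}?offset=${offset}&limit=${limit}`;
    const res = await fetchImpl(url, { headers: { 'X-API-Key': apiKey } });
    if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
    const body = await res.json();
    rows.push(...body.data);
    offset += body.data.length;
    // Stop when every row has been fetched or the server returns an empty page
    if (offset >= body.pagination.total || body.data.length === 0) break;
  }
  return rows;
}
```

Injecting a fake `fetchImpl` that serves slices of a fixed array is a convenient way to unit-test the pagination loop.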