S3: Simple Storage Service

Amazon Simple Storage Service (S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. S3 is designed to store and retrieve any amount of data from anywhere on the web, making it ideal for a wide range of use cases from websites to data lakes.

What is S3?

S3 is object storage built to store and retrieve any amount of data. Unlike traditional file systems, S3 stores data as objects within buckets. Each object consists of the data itself, metadata, and a unique identifier.

Key Concepts

Buckets

  • Container for Objects: A bucket is a container for storing objects
  • Globally Unique Names: Bucket names must be globally unique across all AWS accounts
  • Region-Specific: Buckets are created in a specific AWS region
  • Naming Rules: 3-63 characters; lowercase letters, numbers, hyphens, and periods only; must begin and end with a letter or number
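
The naming rules above can be checked before a create-bucket request is ever sent. The following is a minimal sketch, not an exhaustive validator: the function name `is_valid_bucket_name` is hypothetical, and the extra checks (first and last character, adjacent periods, IP-address format) cover only a subset of the AWS naming rules.

```python
import re

# Lowercase letters, digits, hyphens, and periods; 3-63 characters;
# the first and last characters must be a letter or digit.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address (for example 192.168.0.1) are rejected.
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a subset of the S3 bucket naming rules (illustrative only)."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:  # adjacent periods are not allowed
        return False
    if _IPV4_RE.fullmatch(name):
        return False
    return True
```

A name that passes this check can still be rejected by S3, most commonly because another account already owns it.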

Objects

  • Data + Metadata: An object consists of the data file and its metadata
  • Key: Unique identifier for the object within the bucket
  • Version: With versioning enabled, each object can have multiple versions, each identified by a version ID
  • Up to 5 TB: Maximum object size is 5 terabytes

Keys

  • Unique Identifier: The key is the unique identifier for an object within a bucket
  • Path-like Structure: Keys can include forward slashes to create a directory-like structure
  • Case Sensitive: Keys are case-sensitive
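
Keys live in a flat namespace; the "folders" seen in consoles and tools come from grouping keys by a prefix and a delimiter. The sketch below imitates that grouping in plain Python so the idea can be tried without an AWS account. The function `list_level` and the sample keys are hypothetical; it is a model of the behavior, not the S3 listing API.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Group flat keys the way a prefix + delimiter listing does.

    Returns (objects, common_prefixes): the keys directly under `prefix`,
    and the distinct folder-like prefixes one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):  # comparison is case-sensitive
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            folder = rest.split(delimiter, 1)[0]
            common_prefixes.add(prefix + folder + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = [
    "index.html",
    "photos/2024/cat.jpg",
    "photos/2024/dog.jpg",
    "photos/readme.txt",
    "Photos/readme.txt",  # a different key: keys are case-sensitive
]
top_objects, top_folders = list_level(keys)
photo_objects, photo_folders = list_level(keys, prefix="photos/")
```

Here `photos/` and `Photos/` come back as two separate prefixes, and nothing under `photos/2024/` appears until that prefix is listed explicitly.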

Storage Classes

S3 offers different storage classes optimized for different use cases and access patterns.

S3 Standard

  • General Purpose: Designed for frequently accessed data
  • 99.99% Availability: Designed for 99.99% availability and 99.999999999% (11 nines) durability
  • Use Cases: Websites, content distribution, mobile apps, big data analytics

S3 Intelligent-Tiering

  • Automatic Cost Optimization: Automatically moves data to the most cost-effective access tier
  • No Retrieval Fees: No retrieval charges when objects move between tiers; a small monthly per-object monitoring and automation fee applies
  • Use Cases: Unknown or changing access patterns

S3 Standard-IA (Infrequent Access)

  • Lower Cost: Lower storage price, but charges for retrieval
  • 99.9% Availability: Slightly lower availability than Standard
  • Use Cases: Backup data, disaster recovery, long-term storage

S3 One Zone-IA

  • Single AZ: Stores data in a single Availability Zone
  • Lower Cost: About 20% cheaper than Standard-IA
  • Use Cases: Secondary backup copies, recreatable data

S3 Glacier Instant Retrieval

  • Millisecond Access: Archive storage with millisecond retrieval
  • Low Storage Cost: Lowest storage cost for rarely accessed data that still needs immediate access
  • Use Cases: Medical images, news media assets, genomics data

S3 Glacier Flexible Retrieval

  • Three Retrieval Options: Expedited (1-5 minutes), Standard (3-5 hours), Bulk (5-12 hours)
  • Very Low Cost: Very low storage price
  • Use Cases: Backup, archives, disaster recovery

S3 Glacier Deep Archive

  • Lowest Cost: Lowest cost storage class
  • Long-term Archive: Standard retrieval within 12 hours; bulk retrieval within 48 hours
  • Use Cases: Long-term compliance, digital preservation

Security

Access Control

  • Bucket Policies: JSON-based policies to control access at the bucket level
  • ACLs (Access Control Lists): Legacy method for fine-grained access control
  • IAM Policies: Use IAM to control access to S3 resources
  • Public Access Block: Prevent public access to buckets and objects
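
To make the bucket-policy bullet concrete, here is a minimal sketch that builds one commonly used policy: deny every request that does not arrive over TLS. The function name and the bucket name `example-bucket` are hypothetical; the policy elements and the `aws:SecureTransport` condition key are standard IAM policy syntax.

```python
import json


def deny_insecure_transport_policy(bucket: str) -> str:
    """Return a bucket policy (JSON string) that rejects non-HTTPS requests."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


policy_json = deny_insecure_transport_policy("example-bucket")
```

The resulting string is the kind of document that would be attached as the bucket's policy, for example through the console or an SDK call such as boto3's `put_bucket_policy`. An explicit Deny like this one overrides any Allow granted elsewhere.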

Encryption

  • Server-Side Encryption (SSE): Encrypt objects at rest
    • SSE-S3: Amazon S3 managed keys
    • SSE-KMS: AWS Key Management Service (KMS) managed keys
    • SSE-C: Customer-provided encryption keys
  • Client-Side Encryption: Encrypt data before uploading to S3
  • Encryption in Transit: Use HTTPS/TLS for data transfer
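
The three server-side options differ mainly in which parameters accompany an upload. The sketch below maps each mode to the extra arguments an upload call would carry. The helper `sse_put_args` is hypothetical; the parameter names follow boto3's `put_object`, and the key values are placeholders.

```python
def sse_put_args(mode, kms_key_id=None, customer_key=None):
    """Return the extra upload arguments for a server-side encryption mode."""
    if mode == "SSE-S3":
        # Keys are managed entirely by S3.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            # Without a key ID, the AWS managed key for S3 is used.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        # The caller supplies the key on every request; S3 does not store it.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode!r}")
```

With SSE-C the same key must be supplied again on every read, so losing the key means losing access to the object.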

Access Points

  • Dedicated Access: Create access points with unique names and permissions
  • Shared Data Sets: Simplify shared data access for applications
  • Network Origin Control: Restrict access based on VPC or IP address

Versioning

S3 Versioning allows you to keep multiple versions of an object in the same bucket.

Benefits

  • Protect Against Accidental Deletion: Recover previous versions
  • Preserve, Retrieve, and Restore: Maintain a complete version history
  • MFA Delete: Require multi-factor authentication for permanent deletion
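
Recovery from accidental deletion works because a plain delete in a versioned bucket only adds a delete marker on top of the existing versions. The toy model below illustrates that behavior in memory. It is a deliberately simplified sketch, not the S3 API: the class `VersionedBucket` and its sequential version IDs are invented for illustration.

```python
import itertools

_DELETE_MARKER = object()  # stands in for an S3 delete marker


class VersionedBucket:
    """Toy in-memory model of versioning semantics (illustrative only)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body), oldest first
        self._ids = itertools.count(1)

    def _new_id(self):
        return f"v{next(self._ids)}"

    def put(self, key, body):
        """Store a new version on top of any existing ones."""
        version_id = self._new_id()
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key, version_id=None):
        """Without a version ID, add a delete marker; with one, remove that version."""
        if version_id is None:
            marker_id = self._new_id()
            self._versions.setdefault(key, []).append((marker_id, _DELETE_MARKER))
            return marker_id
        stack = self._versions.get(key, [])
        self._versions[key] = [v for v in stack if v[0] != version_id]
        return version_id

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if an ID is given."""
        stack = self._versions.get(key, [])
        if version_id is None:
            candidates = stack[-1:]
        else:
            candidates = [v for v in stack if v[0] == version_id]
        if not candidates or candidates[0][1] is _DELETE_MARKER:
            raise KeyError(key)
        return candidates[0][1]
```

In this model, after `put`, `put`, `delete`, a plain `get` fails but both earlier versions are still retrievable by ID, and deleting the marker itself makes the newest version current again.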

Lifecycle

  • Version Management: Automatically transition or expire old versions
  • Cost Optimization: Move old versions to cheaper storage classes

Lifecycle Management

S3 Lifecycle policies automate moving objects between storage classes or deleting them based on age.

Transition Actions

  • Move to IA: Transition to Infrequent Access after a period
  • Move to Glacier: Archive to Glacier storage classes
  • Delete: Expiration actions automatically delete objects once they reach a set age

Use Cases

  • Log Files: Transition logs to cheaper storage or delete after retention period
  • Backups: Move backups to Glacier for long-term storage
  • Compliance: Automate retention and deletion policies
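
The log-file use case can be written down as a single lifecycle rule. The sketch below builds a configuration in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument. The function name, rule ID, `logs/` prefix, and day counts are all assumptions chosen for illustration.

```python
def log_lifecycle_config(prefix="logs/", ia_after_days=30,
                         glacier_after_days=90, expire_after_days=365):
    """Build a lifecycle configuration: Standard-IA, then Glacier, then delete."""
    if ia_after_days < 30:
        # Objects must be at least 30 days old before moving to Standard-IA.
        raise ValueError("ia_after_days must be at least 30")
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("day counts must increase from transition to expiration")
    return {
        "Rules": [
            {
                "ID": "example-archive-then-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
                # Also clean up multipart uploads that were never completed.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    }


config = log_lifecycle_config()
```

Keeping the rule as plain data makes it easy to review in version control before it is applied to a bucket.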

Replication

S3 Replication automatically and asynchronously copies objects from one bucket to another. Versioning must be enabled on both the source and destination buckets.

Types

  • Cross-Region Replication (CRR): Replicate across AWS regions
  • Same-Region Replication (SRR): Replicate within the same region
  • Replication Time Control (RTC): Optional feature designed to replicate 99.99% of new objects within 15 minutes

Use Cases

  • Compliance: Meet compliance requirements for data replication
  • Lower Latency: Reduce latency by accessing data from different regions
  • Disaster Recovery: Backup data to different regions
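
A replication setup comes down to a role that S3 assumes plus one or more rules naming a destination bucket. The sketch below builds such a configuration in the shape boto3's `put_bucket_replication` takes for `ReplicationConfiguration`. The function name, role ARN, and bucket name are placeholders, and the sketch assumes versioning is already enabled on both buckets.

```python
def replication_config(role_arn, dest_bucket, prefix="", rtc=False):
    """Build a one-rule replication configuration (illustrative sketch)."""
    destination = {"Bucket": f"arn:aws:s3:::{dest_bucket}"}
    if rtc:
        # Replication Time Control uses a 15-minute target and needs metrics on.
        destination["ReplicationTime"] = {"Status": "Enabled", "Time": {"Minutes": 15}}
        destination["Metrics"] = {"Status": "Enabled", "EventThreshold": {"Minutes": 15}}
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "example-replicate-all",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": destination,
            }
        ],
    }


crr = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "example-dest-bucket",
    rtc=True,
)
```

The same structure serves CRR and SRR; the only difference is whether the destination bucket sits in another region.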

Performance

Transfer Acceleration

  • Faster Uploads: Upload files to S3 faster using CloudFront edge locations
  • Global Network: Uses AWS global network infrastructure
  • Enable Per Bucket: Enable transfer acceleration per bucket

Multipart Upload

  • Large Objects: Upload large objects in parts
  • Parallel Upload: Upload parts in parallel for faster transfer
  • Resume Upload: Resume interrupted uploads
  • Required for >5 GB: Required for objects larger than 5 GB
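
Splitting an object into parts is mostly arithmetic. The sketch below plans the byte ranges for a multipart upload under the commonly documented limits: at most 10,000 parts, each between 5 MiB and 5 GiB except the last, which may be smaller. The function `plan_parts` and the 8 MiB default are assumptions; SDK transfer helpers normally do this planning automatically.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB   # applies to every part except the last
MAX_PART = 5 * GIB
MAX_PARTS = 10_000


def plan_parts(object_size, part_size=8 * MIB):
    """Return a list of (part_number, offset, length) covering the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part_size must be between 5 MiB and 5 GiB")
    # Grow the part size if the object would otherwise need too many parts.
    smallest_allowed = -(-object_size // MAX_PARTS)  # ceiling division
    part_size = max(part_size, smallest_allowed)
    if part_size > MAX_PART:
        raise ValueError("object is too large for 10,000 parts")
    parts, offset, number = [], 0, 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


small_plan = plan_parts(20 * MIB)
```

Each tuple describes an independent byte range, which is what allows parts to be uploaded in parallel and a failed part to be retried without resending the rest.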

Request Rate and Performance

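  • Request Rates: S3 automatically scales to at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix
  • Prefix Performance: Limits apply per prefix, so spread heavy workloads across multiple prefixes; randomized prefixes are no longer required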
  • CloudFront Integration: Use CloudFront for faster content delivery

Static Website Hosting

S3 can host static websites with automatic scaling and high availability.

Features

  • Static Content: HTML, CSS, JavaScript, images, and media files
  • Custom Domain: Use your own domain name with Route 53
  • HTTPS: S3 website endpoints serve HTTP only; put CloudFront in front of the bucket for HTTPS
  • Index and Error Pages: Configure index and error documents

Event Notifications

S3 can send notifications when certain events happen in your bucket.

Event Types

  • s3:ObjectCreated:*: Object created (Put, Post, Copy, or CompleteMultipartUpload)
  • s3:ObjectRemoved:*: Object deleted or delete marker created
  • s3:ObjectRestore:*: Restore of an archived object initiated or completed

Destinations

  • SNS Topics: Publish to Amazon SNS topics
  • SQS Queues: Send messages to Amazon SQS queues
  • Lambda Functions: Trigger AWS Lambda functions
  • EventBridge: Send to Amazon EventBridge
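
Whatever the destination, the consumer receives a JSON message with one or more records. The sketch below shows how a Lambda function might pull the bucket and key out of each record. The function names and the sample event are illustrative; the field layout (`Records`, `eventName`, `s3.bucket.name`, `s3.object.key`) follows the S3 event message structure, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def extract_objects(event):
    """Return (event_name, bucket, key) for each record in an S3 event."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Keys are URL-encoded in the message; spaces arrive as '+'.
        key = unquote_plus(s3["object"]["key"])
        results.append((record["eventName"], s3["bucket"]["name"], key))
    return results


def handler(event, context):
    """Hypothetical Lambda entry point: log every object in the event."""
    for event_name, bucket, key in extract_objects(event):
        print(f"{event_name}: s3://{bucket}/{key}")


sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/my+file%281%29.txt", "size": 1024},
            },
        }
    ]
}
parsed = extract_objects(sample_event)
```

Forgetting to decode the key is a common source of "object not found" errors when the handler later tries to read the object.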

Best Practices

Security

  • Enable versioning for critical data
  • Use bucket policies and IAM policies for access control
  • Enable encryption at rest and in transit
  • Enable MFA delete for production buckets
  • Use Public Access Block to prevent accidental public access

Cost Optimization

  • Use appropriate storage classes for access patterns
  • Enable lifecycle policies to transition or delete old objects
  • Monitor storage usage with S3 Storage Lens
  • Use Intelligent-Tiering for unknown access patterns
  • Clean up incomplete multipart uploads

Performance

  • Use multipart upload for large objects
  • Enable Transfer Acceleration for global uploads
  • Use CloudFront for content delivery
  • Spread high-throughput workloads across multiple key prefixes

Reliability

  • Enable versioning for important data
  • Use cross-region replication for disaster recovery
  • Implement lifecycle policies for automated management
  • Monitor bucket metrics and set up alerts

By understanding S3's features and best practices, you can build scalable, secure, and cost-effective storage solutions for your applications. Always refer to AWS documentation for the latest features and pricing information.