Hadoop S3A Filesystem
A Hadoop S3A Filesystem is a distributed filesystem based on ...
- …
- Counter-Example(s):
- See: S3, S3 REST API.
References
2017
scala> val sample_07 = sc.textFile("s3a://s3-to-ec2/sample_07.csv")
2016
- https://community.hortonworks.com/articles/36339/spark-s3a-filesystem-client-from-hdp-to-access-s3.html
- QUOTE: The S3A filesystem client (s3a://) is a replacement for the S3 Native (s3n://):
  - It uses Amazon’s libraries to interact with S3.
  - Supports larger files
  - Higher performance
  - Supports IAM role-based authentication.
  - Production stable since Hadoop 2.7 (per Apache website)
2016
- https://aws.amazon.com/premiumsupport/knowledge-center/emr-file-system-s3/
- QUOTE: S3A (URI scheme: s3a) - Hadoop’s successor to the S3N filesystem. S3A uses Amazon’s libraries to interact with S3. S3A supports accessing files larger than 5 GB, and it provides performance enhancements and other improvements. For Apache Hadoop, S3A is the successor to S3N and is backward compatible with S3N. Using Apache Hadoop, all objects accessible from s3n:// URLs should also be accessible from S3A by replacing the URL scheme.
- Note: Amazon EMR does not currently support use of the Apache Hadoop S3A file system.
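The quoted claim that objects reachable through s3n:// URLs should also be reachable through S3A "by replacing the URL scheme" can be illustrated with a minimal Scala sketch. The object name S3SchemeRewrite and the helper toS3a are hypothetical, introduced only for this illustration; they are not part of Hadoop or Spark. The sketch only rewrites the string prefix and does not contact S3 or check that the object exists.

```scala
// Hypothetical helper (not a Hadoop or Spark API): rewrites s3n:// locations
// to the s3a:// scheme and leaves every other location untouched.
object S3SchemeRewrite {
  private val OldPrefix = "s3n://"
  private val NewPrefix = "s3a://"

  // Pure string rewrite; the bucket name and object key are kept as-is.
  def toS3a(location: String): String =
    if (location.startsWith(OldPrefix)) NewPrefix + location.stripPrefix(OldPrefix)
    else location
}
```

For example, S3SchemeRewrite.toS3a("s3n://s3-to-ec2/sample_07.csv") yields "s3a://s3-to-ec2/sample_07.csv", which is the form of path passed to sc.textFile in the 2017 entry above. Whether the rewritten path is actually readable depends on the cluster having the S3A client and credentials configured.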