Hadoop S3A Filesystem

References

https://www.cloudera.com/documentation/enterprise/latest/topics/spark_s3.html#spark_s3_examples
- QUOTE: The following example illustrates how to read a text file from Amazon S3 into an RDD, convert the RDD to a DataFrame, and then use the Data Source API to write the DataFrame into a Parquet file on Amazon S3:

scala> val sample_07 = sc.textFile("s3a://s3-to-ec2/sample_07.csv")
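
The quote describes three steps (text file to RDD, RDD to DataFrame, DataFrame to Parquet), but only the first line is captured above. A minimal sketch of the full flow as a standalone Spark application might look like the following. The `Sample` column layout, the comma separator, and the output path `s3a://s3-to-ec2/sample_07.parquet` are assumptions made for illustration, not taken from the quoted page. It also assumes the `hadoop-aws` module and valid S3 credentials are available to Spark.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record layout: the real columns of sample_07.csv are not given in these notes.
final case class Sample(code: String, description: String, totalEmp: Int, salary: Int)

object S3aTextToParquet {
  // Split one delimited line into a Sample; malformed lines yield None.
  // The comma separator is an assumption based on the .csv extension.
  def parseLine(line: String, sep: Char = ','): Option[Sample] =
    line.split(sep) match {
      case Array(code, desc, emp, sal) =>
        scala.util.Try(Sample(code, desc, emp.trim.toInt, sal.trim.toInt)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("s3a-text-to-parquet").getOrCreate()
    import spark.implicits._

    // Step 1: read the text file from S3 into an RDD of lines.
    val lines = spark.sparkContext.textFile("s3a://s3-to-ec2/sample_07.csv")

    // Step 2: parse each line and convert the RDD to a DataFrame.
    val df = lines.flatMap(line => parseLine(line)).toDF()

    // Step 3: write the DataFrame to S3 as Parquet (output path is an assumption).
    df.write.parquet("s3a://s3-to-ec2/sample_07.parquet")

    spark.stop()
  }
}
```

Keeping `parseLine` separate from the Spark calls makes the parsing logic checkable without a cluster or S3 access.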

https://aws.amazon.com/premiumsupport/knowledge-center/emr-file-system-s3/
- QUOTE: S3A (URI scheme: s3a) - Hadoop’s successor to the S3N filesystem. S3A uses Amazon’s libraries to interact with S3. S3A supports accessing files larger than 5 GB, and it provides performance enhancements and other improvements. For Apache Hadoop, S3A is the successor to S3N and is backward compatible with S3N. Using Apache Hadoop, all objects accessible from s3n:// URLs should also be accessible from S3A by replacing the URL scheme.
  - Note: Amazon EMR does not currently support use of the Apache Hadoop S3A file system.
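
The scheme replacement mentioned in the quote can be sketched as a small helper. The object and function names (`S3Scheme`, `toS3a`) are hypothetical, and this is plain string handling rather than anything provided by Hadoop itself:

```scala
object S3Scheme {
  // Rewrite an s3n:// URL to s3a://, leaving every other URL untouched.
  def toS3a(url: String): String =
    if (url.startsWith("s3n://")) "s3a://" + url.stripPrefix("s3n://")
    else url
}
```

For example, `S3Scheme.toS3a("s3n://s3-to-ec2/sample_07.csv")` yields `s3a://s3-to-ec2/sample_07.csv`, while an `hdfs://` path is returned unchanged.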