Commit 77a75cc

ahmarsuhail authored and steveloughran committed
HADOOP-18073. Upgrade AWS SDK to v2 in S3A
This is an aggregate patch of the changes from feature-HADOOP-18073-s3a-sdk-upgrade and moves the S3A connector to using the V2 AWS SDK.

This is a major change: see aws_sdk_v2_changelog.md for details.

A new shaded v2 SDK JAR "bundle.jar" needs to be distributed with the connector to interact with S3 stores.

All code which was using the V1 SDK classes with the S3AFileSystem will need upgrading.

Contributed by Ahmar Suhail

HADOOP-18820. Cut AWS v1 support (#5872)

This removes the AWS V1 SDK as a hadoop-aws runtime dependency.

It is still used at compile time so as to build a wrapper class V1ToV2AwsCredentialProviderAdapter, which allows a v1 credential provider to be used for authentication. All well-known credential providers have their classname remapped from v1 to v2 classes prior to instantiation; this wrapper is not needed for them.

There is no support for migrating other SDK plugin points (signing, handlers).

Access to the v2 S3Client class used by an S3A FileSystem instance is now via a new interface org.apache.hadoop.fs.s3a.S3AInternals; other low-level operations (getObjectMetadata(Path)) have moved.

Contributed by Steve Loughran

HADOOP-18853. Upgrade AWS SDK version to 2.20.28 (#5960)

Upgrades the AWS SDK v2 version to 2.20.28.

This
* adds multipart COPY/rename in the Java async client
* removes the aws-crt JAR dependency

Contributed by Ahmar Suhail

HADOOP-18818. Merge aws v2 upgrade feature branch into trunk

Contains HADOOP-18863. AWS SDK V2 - AuditFailureExceptions aren't being translated properly

Change-Id: I96b26cc1ee535c519248ca6541fb157017dcc7e4
1 parent 01cc6d0 commit 77a75cc

205 files changed

Lines changed: 8988 additions & 5898 deletions

LICENSE-binary

Lines changed: 1 addition & 0 deletions
@@ -364,6 +364,7 @@ org.objenesis:objenesis:2.6
 org.xerial.snappy:snappy-java:1.1.10.1
 org.yaml:snakeyaml:2.0
 org.wildfly.openssl:wildfly-openssl:1.1.3.Final
+software.amazon.awssdk:bundle:jar:2.20.128
 
 
 --------------------------------------------------------------------------------

hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/statistics/StoreStatisticNames.java

Lines changed: 4 additions & 0 deletions
@@ -407,6 +407,10 @@ public final class StoreStatisticNames {
   public static final String MULTIPART_UPLOAD_LIST
       = "multipart_upload_list";
 
+  /** Probe for store region: {@value}. */
+  public static final String STORE_REGION_PROBE
+      = "store_region_probe";
+
   private StoreStatisticNames() {
   }

hadoop-common-project/hadoop-common/src/main/resources/core-default.xml

Lines changed: 11 additions & 41 deletions
@@ -1387,61 +1387,31 @@
   <description>AWS secret key used by S3A file system. Omit for IAM role-based or provider-based authentication.</description>
 </property>
 
+<property>
+  <name>fs.s3a.session.token</name>
+  <description>Session token, when using org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
+    as one of the providers.
+  </description>
+</property>
+
 <property>
   <name>fs.s3a.aws.credentials.provider</name>
   <value>
     org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,
     org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
-    com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
+    software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
     org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider
   </value>
   <description>
     Comma-separated class names of credential provider classes which implement
-    com.amazonaws.auth.AWSCredentialsProvider.
+    software.amazon.awssdk.auth.credentials.AwsCredentialsProvider.
 
     When S3A delegation tokens are not enabled, this list will be used
     to directly authenticate with S3 and other AWS services.
     When S3A Delegation tokens are enabled, depending upon the delegation
     token binding it may be used
     to communicate wih the STS endpoint to request session/role
     credentials.
-
-    These are loaded and queried in sequence for a valid set of credentials.
-    Each listed class must implement one of the following means of
-    construction, which are attempted in order:
-    * a public constructor accepting java.net.URI and
-        org.apache.hadoop.conf.Configuration,
-    * a public constructor accepting org.apache.hadoop.conf.Configuration,
-    * a public static method named getInstance that accepts no
-       arguments and returns an instance of
-       com.amazonaws.auth.AWSCredentialsProvider, or
-    * a public default constructor.
-
-    Specifying org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider allows
-    anonymous access to a publicly accessible S3 bucket without any credentials.
-    Please note that allowing anonymous access to an S3 bucket compromises
-    security and therefore is unsuitable for most use cases. It can be useful
-    for accessing public data sets without requiring AWS credentials.
-
-    If unspecified, then the default list of credential provider classes,
-    queried in sequence, is:
-    * org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider: looks
-       for session login secrets in the Hadoop configuration.
-    * org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider:
-       Uses the values of fs.s3a.access.key and fs.s3a.secret.key.
-    * com.amazonaws.auth.EnvironmentVariableCredentialsProvider: supports
-        configuration of AWS access key ID and secret access key in
-        environment variables named AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
-        and AWS_SESSION_TOKEN as documented in the AWS SDK.
-    * org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider: picks up
-       IAM credentials of any EC2 VM or AWS container in which the process is running.
-  </description>
-</property>
-
-<property>
-  <name>fs.s3a.session.token</name>
-  <description>Session token, when using org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
-    as one of the providers.
   </description>
 </property>
 
@@ -1539,10 +1509,10 @@
     Note: for job submission to actually collect these tokens,
     Kerberos must be enabled.
 
-    Options are:
+    Bindings available in hadoop-aws are:
     org.apache.hadoop.fs.s3a.auth.delegation.SessionTokenBinding
     org.apache.hadoop.fs.s3a.auth.delegation.FullCredentialsTokenBinding
-    and org.apache.hadoop.fs.s3a.auth.delegation.RoleTokenBinding
+    org.apache.hadoop.fs.s3a.auth.delegation.RoleTokenBinding
   </description>
 </property>
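
The core-default.xml change above swaps a v1 provider classname for its v2 equivalent, and the commit message says well-known provider classnames are remapped before instantiation. A minimal sketch of that idea follows, assuming a simple lookup table. The class `CredentialProviderRemapSketch`, its `V1_TO_V2` table and the `remap` method are hypothetical names for illustration and are not the S3A implementation; the single table entry is the only pair shown in this diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical helper, not code from hadoop-aws. */
final class CredentialProviderRemapSketch {

  // Assumed remapping table; only this pair appears in the diff above.
  static final Map<String, String> V1_TO_V2 = Map.of(
      "com.amazonaws.auth.EnvironmentVariableCredentialsProvider",
      "software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider");

  /**
   * Split a comma-separated provider list, trim each entry,
   * and replace any known v1 classname with its v2 equivalent.
   * Unknown names are passed through unchanged.
   */
  static List<String> remap(String commaSeparated) {
    List<String> result = new ArrayList<>();
    for (String name : commaSeparated.split(",")) {
      String trimmed = name.trim();
      if (trimmed.isEmpty()) {
        continue; // tolerate trailing commas and blank entries
      }
      result.add(V1_TO_V2.getOrDefault(trimmed, trimmed));
    }
    return result;
  }

  public static void main(String[] args) {
    String configured =
        "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,\n"
        + "  com.amazonaws.auth.EnvironmentVariableCredentialsProvider";
    remap(configured).forEach(System.out::println);
  }
}
```

Passing unknown names through unchanged keeps the sketch neutral about custom providers, which the commit message says are handled by a separate wrapper class.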

hadoop-project/pom.xml

Lines changed: 20 additions & 2 deletions
@@ -184,6 +184,8 @@
     <surefire.fork.timeout>900</surefire.fork.timeout>
     <aws-java-sdk.version>1.12.499</aws-java-sdk.version>
     <hsqldb.version>2.7.1</hsqldb.version>
+    <aws-java-sdk-v2.version>2.20.128</aws-java-sdk-v2.version>
+    <aws.eventstream.version>1.0.1</aws.eventstream.version>
     <frontend-maven-plugin.version>1.11.2</frontend-maven-plugin.version>
     <jasmine-maven-plugin.version>2.1</jasmine-maven-plugin.version>
     <phantomjs-maven-plugin.version>0.7</phantomjs-maven-plugin.version>
@@ -1128,15 +1130,31 @@
       </dependency>
       <dependency>
         <groupId>com.amazonaws</groupId>
-        <artifactId>aws-java-sdk-bundle</artifactId>
+        <artifactId>aws-java-sdk-core</artifactId>
         <version>${aws-java-sdk.version}</version>
         <exclusions>
           <exclusion>
-            <groupId>io.netty</groupId>
+            <groupId>*</groupId>
+            <artifactId>*</artifactId>
+          </exclusion>
+        </exclusions>
+      </dependency>
+      <dependency>
+        <groupId>software.amazon.awssdk</groupId>
+        <artifactId>bundle</artifactId>
+        <version>${aws-java-sdk-v2.version}</version>
+        <exclusions>
+          <exclusion>
+            <groupId>*</groupId>
             <artifactId>*</artifactId>
           </exclusion>
         </exclusions>
       </dependency>
+      <dependency>
+        <groupId>software.amazon.eventstream</groupId>
+        <artifactId>eventstream</artifactId>
+        <version>${aws.eventstream.version}</version>
+      </dependency>
       <dependency>
         <groupId>org.apache.mina</groupId>
         <artifactId>mina-core</artifactId>

hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml

Lines changed: 5 additions & 0 deletions
@@ -64,6 +64,11 @@
     <Field name="futurePool"/>
     <Bug pattern="IS2_INCONSISTENT_SYNC"/>
   </Match>
+  <Match>
+    <Class name="org.apache.hadoop.fs.s3a.S3AFileSystem"/>
+    <Field name="s3AsyncClient"/>
+    <Bug pattern="IS2_INCONSISTENT_SYNC"/>
+  </Match>
   <Match>
     <Class name="org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo"/>
     <Method name="run"/>

hadoop-tools/hadoop-aws/pom.xml

Lines changed: 15 additions & 1 deletion
@@ -494,11 +494,25 @@
       <scope>test</scope>
       <type>test-jar</type>
     </dependency>
+
+    <!-- The v1 SDK is used at compilation time for adapter classes in
+         org.apache.hadoop.fs.s3a.adapter. It is not needed at runtime
+         unless a non-standard v1 credential provider is declared. -->
     <dependency>
       <groupId>com.amazonaws</groupId>
-      <artifactId>aws-java-sdk-bundle</artifactId>
+      <artifactId>aws-java-sdk-core</artifactId>
+      <scope>provided</scope>
+    </dependency>
+    <dependency>
+      <groupId>software.amazon.awssdk</groupId>
+      <artifactId>bundle</artifactId>
       <scope>compile</scope>
     </dependency>
+    <dependency>
+      <groupId>software.amazon.eventstream</groupId>
+      <artifactId>eventstream</artifactId>
+      <scope>test</scope>
+    </dependency>
     <dependency>
       <groupId>org.assertj</groupId>
       <artifactId>assertj-core</artifactId>
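
The pom comment above notes that the v1 SDK is only needed to compile adapter classes, which let a v1 credential provider be used where a v2 provider is expected. The following is a rough sketch of that adapter pattern, not the actual V1ToV2AwsCredentialProviderAdapter. All nested interfaces are local stand-ins declared here so the example compiles without either SDK; their method names are modelled on the v1 (`getCredentials`, `getAWSAccessKeyId`, `getAWSSecretKey`) and v2 (`resolveCredentials`, `accessKeyId`, `secretAccessKey`) credential APIs, and `Adapter` and `fixed` are hypothetical names.

```java
import java.util.Objects;

/** Illustrative sketch only; the nested types are stand-ins, not SDK classes. */
final class V1ToV2AdapterSketch {

  /** Stand-in for a v1-style credentials object. */
  interface V1Credentials {
    String getAWSAccessKeyId();
    String getAWSSecretKey();
  }

  /** Stand-in for a v1-style credential provider. */
  interface V1CredentialsProvider {
    V1Credentials getCredentials();
  }

  /** Stand-in for a v2-style credentials object. */
  interface V2Credentials {
    String accessKeyId();
    String secretAccessKey();
  }

  /** Stand-in for a v2-style credential provider. */
  interface V2CredentialsProvider {
    V2Credentials resolveCredentials();
  }

  /** Wraps a v1-style provider and exposes it through the v2-style interface. */
  static final class Adapter implements V2CredentialsProvider {
    private final V1CredentialsProvider v1;

    Adapter(V1CredentialsProvider v1) {
      this.v1 = Objects.requireNonNull(v1, "v1 provider");
    }

    @Override
    public V2Credentials resolveCredentials() {
      // Ask the wrapped provider each time, so refreshed credentials are seen.
      final V1Credentials c = v1.getCredentials();
      return new V2Credentials() {
        @Override public String accessKeyId() { return c.getAWSAccessKeyId(); }
        @Override public String secretAccessKey() { return c.getAWSSecretKey(); }
      };
    }
  }

  /** Hypothetical helper: fixed v1-style credentials for the demo. */
  static V1Credentials fixed(String id, String secret) {
    return new V1Credentials() {
      @Override public String getAWSAccessKeyId() { return id; }
      @Override public String getAWSSecretKey() { return secret; }
    };
  }

  public static void main(String[] args) {
    V2CredentialsProvider provider =
        new Adapter(() -> fixed("example-id", "example-secret"));
    System.out.println(provider.resolveCredentials().accessKeyId());
  }
}
```

Session tokens are left out of this sketch for brevity; a real adapter would presumably need to carry them across as well.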

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AWSBadRequestException.java

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@
 
 package org.apache.hadoop.fs.s3a;
 
-import com.amazonaws.AmazonServiceException;
+import software.amazon.awssdk.awscore.exception.AwsServiceException;
 
 /**
  * A 400 "Bad Request" exception was received.
@@ -36,7 +36,7 @@ public class AWSBadRequestException extends AWSServiceIOException {
    * @param cause the underlying cause
    */
   public AWSBadRequestException(String operation,
-      AmazonServiceException cause) {
+      AwsServiceException cause) {
     super(operation, cause);
   }
 }

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AWSClientIOException.java

Lines changed: 12 additions & 6 deletions
@@ -18,34 +18,40 @@
 
 package org.apache.hadoop.fs.s3a;
 
-import com.amazonaws.AmazonClientException;
-import com.amazonaws.SdkBaseException;
+import software.amazon.awssdk.core.exception.SdkException;
 import org.apache.hadoop.util.Preconditions;
 
 import java.io.IOException;
 
 /**
- * IOException equivalent of an {@link AmazonClientException}.
+ * IOException equivalent of an {@link SdkException}.
  */
 public class AWSClientIOException extends IOException {
 
   private final String operation;
 
   public AWSClientIOException(String operation,
-      SdkBaseException cause) {
+      SdkException cause) {
     super(cause);
     Preconditions.checkArgument(operation != null, "Null 'operation' argument");
     Preconditions.checkArgument(cause != null, "Null 'cause' argument");
     this.operation = operation;
   }
 
-  public AmazonClientException getCause() {
-    return (AmazonClientException) super.getCause();
+  public SdkException getCause() {
+    return (SdkException) super.getCause();
   }
 
   @Override
   public String getMessage() {
     return operation + ": " + getCause().getMessage();
   }
 
+  /**
+   * Query inner cause for retryability.
+   * @return what the cause says.
+   */
+  public boolean retryable() {
+    return getCause().retryable();
+  }
 }
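
The new `retryable()` method delegates the retry decision to the wrapped SDK exception. A self-contained sketch of that delegation is shown here, assuming a stand-in exception type so it compiles without the AWS SDK. `FakeSdkException`, `ClientIOException` and `decide` are hypothetical names used only for illustration; they are not Hadoop or SDK classes.

```java
import java.io.IOException;

/** Illustrative sketch only; uses stand-in types rather than SDK classes. */
final class RetryableSketch {

  /** Stand-in for an SDK exception that can report whether it is retryable. */
  static class FakeSdkException extends RuntimeException {
    private final boolean retryable;

    FakeSdkException(String message, boolean retryable) {
      super(message);
      this.retryable = retryable;
    }

    boolean retryable() {
      return retryable;
    }
  }

  /** Mirrors the shape of the IOException wrapper in the diff above. */
  static class ClientIOException extends IOException {
    private final String operation;

    ClientIOException(String operation, FakeSdkException cause) {
      super(cause);
      if (operation == null) {
        throw new IllegalArgumentException("Null 'operation' argument");
      }
      if (cause == null) {
        throw new IllegalArgumentException("Null 'cause' argument");
      }
      this.operation = operation;
    }

    @Override
    public FakeSdkException getCause() {
      // Safe cast: the constructor only accepts FakeSdkException causes.
      return (FakeSdkException) super.getCause();
    }

    @Override
    public String getMessage() {
      return operation + ": " + getCause().getMessage();
    }

    /** Delegate the retry decision to the wrapped cause. */
    boolean retryable() {
      return getCause().retryable();
    }
  }

  /** Hypothetical caller: choose an action based on the wrapped cause. */
  static String decide(ClientIOException e) {
    return e.retryable() ? "retry" : "fail";
  }

  public static void main(String[] args) {
    ClientIOException e =
        new ClientIOException("GET", new FakeSdkException("throttled", true));
    System.out.println(e.getMessage() + " -> " + decide(e));
  }
}
```

Keeping the decision on the cause means a caller holding only the IOException can still consult what the SDK reported, without unwrapping it first.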
