This article walks through integrating Spark into a Spring Boot application and running spark-sql queries from it, using a practical example. The steps are simple and quick to follow, and the approach is practical; hopefully it helps you solve the same problem.

First, add the required dependencies:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.6.RELEASE</version>
        <relativePath/>
    </parent>

    <groupId>com.cord</groupId>
    <artifactId>spark-example</artifactId>
    <version>1.0-SNAPSHOT</version>
    <name>spark-example</name>
    <!-- FIXME change it to the project's website -->
    <url>http://www.example.com</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
        <scala.version>2.10.3</scala.version>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
            <version>1.5.6.RELEASE</version>
            <exclusions>
                <exclusion>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-starter-logging</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.1</version>
            <scope>provided</scope>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>log4j</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>1.6.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.10</artifactId>
            <version>1.6.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
            <scope>provided</scope>
        </dependency>
        <!-- yarn-cluster mode -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.22</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <dependencies>
                    <dependency>
                        <groupId>org.springframework.boot</groupId>
                        <artifactId>spring-boot-maven-plugin</artifactId>
                        <version>1.5.6.RELEASE</version>
                    </dependency>
                </dependencies>
                <configuration>
                    <keepDependenciesWithProvidedScope>false</keepDependenciesWithProvidedScope>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                    <filters>
                        <filter>
                            <artifact>*:*</artifact>
                            <excludes>
                                <exclude>META-INF/*.SF</exclude>
                                <exclude>META-INF/*.DSA</exclude>
                                <exclude>META-INF/*.RSA</exclude>
                            </excludes>
                        </filter>
                    </filters>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                            <resource>META-INF/spring.handlers</resource>
                        </transformer>
                        <transformer implementation="org.springframework.boot.maven.PropertiesMergingResourceTransformer">
                            <resource>META-INF/spring.factories</resource>
                        </transformer>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                            <resource>META-INF/spring.schemas</resource>
                        </transformer>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                            <mainClass>com.cord.StartApplication</mainClass>
                        </transformer>
                    </transformers>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Two things in this POM deserve attention: the logging modules excluded from the dependencies, and the unusual packaging. Spark brings its own log4j/slf4j-log4j12 binding, which clashes with Spring Boot's default logback setup, so the Spark-side bindings and spring-boot-starter-logging are excluded. For packaging, the maven-shade-plugin (with Spring Boot's resource transformers merging spring.handlers, spring.factories, and spring.schemas) builds a flat fat jar, because spark-submit cannot load the nested-jar layout that the spring-boot-maven-plugin's repackage goal produces.
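
To confirm that only one slf4j binding survives after these exclusions, the dependency tree can be filtered (a standard Maven command, not part of the original article):

mvn dependency:tree -Dincludes=org.slf4j,log4j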

Define the configuration class:

SparkContextBean.class

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.hive.HiveContext;
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SparkContextBean {

    private String appName = "sparkExp";
    private String master = "local";

    // Spark configuration; only created if no SparkConf bean exists yet.
    @Bean
    @ConditionalOnMissingBean(SparkConf.class)
    public SparkConf sparkConf() throws Exception {
        SparkConf conf = new SparkConf().setAppName(appName).setMaster(master);
        return conf;
    }

    @Bean
    @ConditionalOnMissingBean
    public JavaSparkContext javaSparkContext() throws Exception {
        return new JavaSparkContext(sparkConf());
    }

    // HiveContext (Spark 1.x) exposes Hive tables to spark-sql queries.
    @Bean
    @ConditionalOnMissingBean
    public HiveContext hiveContext() throws Exception {
        return new HiveContext(javaSparkContext());
    }
    ......
}
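
The trailing "......" elides the rest of the author's class. As a side note (an assumption, not the elided code): when no Hive metastore is available, a plain SQLContext bean can be exposed in the same style, since HiveContext extends SQLContext in Spark 1.x:

// Hypothetical extra bean for SparkContextBean, assuming no Hive is installed.
// Requires: import org.apache.spark.sql.SQLContext;
@Bean
@ConditionalOnMissingBean
public SQLContext sqlContext() throws Exception {
    return new SQLContext(javaSparkContext());
}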

The startup class:

StartApplication.class

import java.util.List;

import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.hive.HiveContext;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class StartApplication implements CommandLineRunner {

    @Autowired
    private HiveContext hc;

    public static void main(String[] args) {
        SpringApplication.run(StartApplication.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        // Run a spark-sql query against a Hive table and pull the result
        // back to the driver.
        DataFrame df = hc.sql("select count(1) from LCS_DB.STAFF_INFO");
        List<Long> result = df.javaRDD()
                .map((Function<Row, Long>) row -> row.getLong(0))
                .collect();
        result.stream().forEach(System.out::println);
    }
}
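
LCS_DB.STAFF_INFO is a table in the author's Hive metastore. For readers without it, here is a minimal self-contained sketch (an illustration, not part of the original article) that runs the same count(1) query against a temp table built from an in-memory list, using the Spark 1.6 DataFrame API:

import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class TempTableExample {

    // JavaBean backing the DataFrame schema; the field name becomes the column name.
    public static class Staff implements Serializable {
        private String name;
        public Staff() { }
        public Staff(String name) { this.name = name; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    public static void run(JavaSparkContext jsc, SQLContext sqlContext) {
        JavaRDD<Staff> staff = jsc.parallelize(
                Arrays.asList(new Staff("a"), new Staff("b"), new Staff("c")));
        DataFrame df = sqlContext.createDataFrame(staff, Staff.class);
        df.registerTempTable("STAFF_INFO");

        List<Long> result = sqlContext.sql("select count(1) from STAFF_INFO")
                .javaRDD()
                .map((Function<Row, Long>) row -> row.getLong(0))
                .collect();
        result.forEach(System.out::println); // prints 3
    }
}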

Submitting the job:

spark-submit \
--class com.cord.StartApplication \
--executor-memory 4G \
--num-executors 8 \
--master yarn-client \
/data/cord/spark-example-1.0-SNAPSHOT.jar
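
The POM's mysql-connector comment mentions yarn-cluster mode. As a hedged variant (not shown in the original), the same jar can also be submitted with the driver running inside the cluster; Spark 1.6 still accepts yarn-cluster as a master value:

spark-submit \
--class com.cord.StartApplication \
--executor-memory 4G \
--num-executors 8 \
--master yarn-cluster \
/data/cord/spark-example-1.0-SNAPSHOT.jar

In yarn-client mode the driver (and thus the Spring Boot application context) runs on the submitting machine; in yarn-cluster mode it runs in a YARN container, so local file paths and console output behave differently.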

That concludes this walkthrough of integrating Spark with Spring Boot and using spark-sql. Thanks for reading.