The relationship between Hadoop, Java, and Python

Python is a dynamically typed programming language.

Hadoop is a distributed computing framework written in Java.

The two sit at different levels: one is a language, the other is a framework.

If you want to connect them, Python can be used to develop distributed computing jobs on top of the Hadoop framework.

Languages and frameworks can be combined freely, though. Java can also be used to develop distributed computing programs on Hadoop.

Python can likewise use Spark for distributed computing, so you can pair language and framework according to your needs.
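One common way to pair Python with Hadoop is Hadoop Streaming, where the mapper and reducer are ordinary programs that exchange tab-separated key/value lines. The article does not name this mechanism, so treat the following as a minimal sketch under that assumption. The function names and the sample sentences are made up for illustration, and the sort step is simulated locally rather than performed by a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines):
    """Sum the counts for each word; the input must be sorted by key."""
    pairs = (line.split("\t") for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Local simulation: Hadoop would sort the mapper output between the phases.
sample = ["Hadoop is written in Java", "Python can use Hadoop"]
result = list(reducer(sorted(mapper(sample))))
print(result)
```

In a real streaming job each function would probably live in its own script that reads sys.stdin and prints to standard output. Keeping them as generators here makes the logic easy to test without a cluster.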

The differences between Java and Python:

First, the Python virtual machine is comparatively weak, whereas the Java virtual machine is the core of Java. Python's core strength is how conveniently it can call C functions and C++ libraries.

Second, Python is fully dynamic: a program can modify its own code at runtime, which Java can achieve only through workarounds. Python variables are dynamically typed, while Java variables are statically typed and must be declared beforehand, which is why code completion in Java IDEs is generally better than in Python IDEs.
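To make the "modify code at runtime" point concrete, here is a small sketch. The class and function names (Greeter, shout) are invented for the example. It replaces a method on a live class, attaches a new attribute to an existing object, and rebinds a name to a value of a different type.

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()
before = g.greet()


# Replace the method on the class while the program is running.
def shout(self):
    return "HELLO"


Greeter.greet = shout
after = g.greet()

# Attributes can also be attached to an existing object at runtime.
g.language = "Python"

# A name is not tied to a type: it can be rebound to a value of another type.
value = 42
value = "forty-two"

print(before, after, g.language, type(value).__name__)
```

This flexibility is also why static tooling has a harder time with Python: the set of methods and attributes an object has can change while the program runs.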

Third, Python has been around for decades, and procedural programming was mainstream for much of that time, so many Python programs use a procedural design and borrow concepts from C. Classes are available in Python but optional. Java, by contrast, set out to be C++ without pointers (relying on a virtual machine, where COM components had relied on reference counting). It mainly adopts object-oriented design, so many of its concepts are OOP concepts. Procedural code is relatively simple and intuitive but easily turns into spaghetti code; object-oriented code is more abstract and elegant but easily becomes over-abstracted.

Fourth, Python is easy to start with, but to do real work in it you need to learn its many libraries. Python's power lies in those libraries, which can be written in Python, C, or C++ and are all usable from Python. Whether you need GPU computing, neural networks, intelligent algorithms, data analysis, image processing, or scientific computing, a library is waiting for you. Java does not have as many open source libraries as Python: many Java libraries are used internally by commercial companies or are released only as a JAR package without visible source code. Python does not support compilation as well as Java does (perhaps by design), so Python programs are generally distributed directly as source code (on Linux), or the source is simply bundled into a package (for example with py2exe).

Fifth, Python has many implementations, such as CPython, Pyston, PyPy, Jython, and IronPython, which makes it suitable as a business scripting language, a plug-in language, or a domain-specific language. Java is rarely used as a plug-in language because its large virtual machine is inconvenient to distribute.

Sixth, Java is mainly used in fields with heavy business logic, such as e-commerce systems, ERP, OA, finance, insurance, and other traditional database transaction areas. Frameworks such as SSH (Struts, Spring, Hibernate) handle the transaction code, and commercial databases such as Oracle, DB2, and SQL Server are well supported. Java has a strong software engineering culture and suits multi-person development. Python is mainly used for web data analysis, scientific computing, financial analysis, signal analysis, image algorithms, mathematical computation, statistical analysis, algorithm modeling, server operations and maintenance, and automation. It favors rapid development and suits small agile teams or individuals.

Seventh, Java has more support from commercial companies, such as SAP, Oracle, and IBM, and offers commercial containers, middleware, and the EJB enterprise framework. Python is supported by many open source projects and organizations, such as Qt, Linux, and Google, and many open source programs support Python, such as PyQt, Redis, and Spark.

Eighth, Python is most often used for scripting and Java for web applications. Python is glue: it can stick all kinds of unrelated things together. Java is the foundation: software engineering lets you assemble a team of hundreds of people to compete, in a thoroughly commercial atmosphere. Still, I think Python is the more powerful of the two, because it can easily call C and C++ libraries and suits rapid development, although it falls short of Java in software engineering and commercial operation.

Ninth, about money.

If you want to sell software written in Java, you can build on IBM servers, Oracle databases, and EMC storage: high prices, the kind of upmarket stack that commercial procurement departments like. If you want a program to make money directly, Python can handle quantitative finance, data backtesting, stock trading, speculation in gold and bitcoin, hedge arbitrage, and statistical arbitrage, with many open source libraries, data analysis libraries, and machine learning libraries to draw on.

Tenth, both Java and Python run on Linux, but many Linux distributions ship with Python, whereas Java has to be installed separately. Java and Python have an edge over C# mainly because of their support for Linux, macOS, Unix, and ARM. They are more popular than C++ because they do not require pointers.

Eleventh, on mobile, Python can run on Android or iOS only through a runtime library. Java natively supports Android development but not iOS.

Twelfth, for big data, Hadoop is developed in Java and Spark in Scala, and Spark is more convenient to call from Python.

In plain terms, the main points are the following:

1. Python is simpler than Java, with a lower learning cost and higher development efficiency.

2. Java runs more efficiently than Python; programs written in pure Python in particular can be very slow.

3. There is a lot of Java-related material, especially in Chinese.

4. Java versions are relatively stable, whereas Python 2 and Python 3 are not compatible, which broke a large number of libraries.

5. Java development leans towards software engineering and team collaboration; Python is better suited to small-scale development.

6. Java leans towards commercial development; Python is well suited to data analysis.

7. Java is a statically typed language; Python is a dynamically typed language.

8. All variables in Java must be declared with a type before they can be used. Variables in Python do not need a declared type.

9. Java code must be compiled before it runs; Python code can be run directly.

10. Blocks in Java are enclosed in braces; blocks in Python are introduced by a colon and marked by indentation (four spaces by convention).

11. Types in Java must be declared; in Python this is not required.

12. Each statement in Java ends with a semicolon; Python statements do not need one.

13. To implement the same function, Java usually requires more typing than Python.
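The syntax points in the list above (no type declarations, no semicolons, a colon plus indentation instead of braces) can be seen in a few lines. The function name sum_even is made up for the example, and the Java fragment in the comments is only a rough equivalent for comparison.

```python
# Rough Java equivalent, for comparison (shown as comments only):
#   int total = 0;
#   for (int n : numbers) {
#       if (n % 2 == 0) { total += n; }
#   }


def sum_even(numbers):
    total = 0                 # no type declaration, no semicolon
    for n in numbers:         # a colon opens the block
        if n % 2 == 0:        # indentation, not braces, defines nesting
            total += n
    return total


print(sum_even([1, 2, 3, 4]))
```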

Some details differ:

1. Number

Python 2 has four numeric types: integers, long integers, floating point numbers, and complex numbers. Python 3 merges the two integer types into a single arbitrary-precision int.

Java has the char, byte, short, int, long, float, and double types.
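As a small illustration of Python's numeric types, the sketch below assumes Python 3, where int has arbitrary precision and therefore is not limited to a fixed width the way Java's long is. The variable names are arbitrary.

```python
big = 2 ** 100             # far beyond the range of a 64-bit Java long
ratio = 7 / 2              # the / operator yields a float
z = complex(3, 4)          # complex numbers are built in

print(big)
print(type(ratio).__name__, ratio)
print(abs(z))              # magnitude of 3 + 4j
```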

2. String

2.1. String representation

Python has no char type for a single character; a single character is just a string of length one. String literals can be written with single quotes ' ' or double quotes " ", and triple quotes can be used for a multi-line string.

In Java, char represents a single character and String represents a string. Character literals are written with single quotes ' ' and string literals with double quotes " ".

2.2. Multi-line string

In Python, a backslash (\) at the end of a line indicates that the string continues on the next line.

Java uses a plus sign (+) to concatenate string pieces that are split across lines.

2.3. Other representations in Python

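In Python, a string can be prefixed with r or R to make it a raw string, in which backslashes are not treated as escape characters; this makes such strings easier to write than in Java.

A string can also be prefixed with u or U to mark it as a Unicode string. (In Python 3 every str is already Unicode, and the prefix is accepted only for compatibility.)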
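The string forms described in this section can be compared side by side. This is only an illustrative sketch; the variable names and the sample path are invented.

```python
single = 'text'
double = "text"
multi = """line one
line two"""

# A backslash at the end of the line continues the literal on the next line.
continued = "first half, \
second half"

# In a raw string the backslash is kept as an ordinary character.
raw_path = r"C:\new\table"
escaped_path = "C:\\new\\table"

print(single == double)
print(multi.splitlines())
print(continued)
print(raw_path == escaped_path)
```

Without the r prefix, the \n and \t in the path would be read as a newline and a tab, which is why raw strings are popular for Windows paths and regular expressions.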

Note: The __init__() method in Python is similar to a constructor in Java. In a Java constructor, the reference to the current object (this) exists by default and does not need to be declared. In Python, it must be declared explicitly as the first parameter of __init__(), conventionally named self, although the caller does not pass it explicitly.
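A short sketch of the explicit self parameter follows. The Account class and its fields are hypothetical and serve only to show the mechanics.

```python
class Account:
    def __init__(self, owner, balance=0):
        # 'self' is the instance being initialised; Python supplies it.
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


acct = Account("alice", 10)   # the caller does not pass 'self'
print(acct.deposit(5))
```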

3. Operator

In Python, ** represents exponentiation: x**y is x raised to the power y.

In Python, // represents floor division: the quotient rounded down to an integer.

In Python, ~ means bitwise inversion: ~x is -(x+1).
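The three operators can be checked quickly in an interpreter. The variable names below are arbitrary. Note the negative case, where floor division rounds toward negative infinity rather than toward zero.

```python
power = 2 ** 10        # exponentiation
floor_pos = 7 // 2     # floor division
floor_neg = -7 // 2    # rounds toward negative infinity, not toward zero
inverted = ~5          # bitwise inversion, -(5 + 1)

print(power, floor_pos, floor_neg, inverted)
```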

4. String representation of an object

Python uses the str() or repr() function to obtain a string representation of an object.

Java does this through the toString() method.
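In Python, a class controls what str() and repr() return by defining __str__ and __repr__, much as a Java class overrides toString(). The Point class below is a made-up example.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, aimed at developers.
        return f"Point(x={self.x}, y={self.y})"

    def __str__(self):
        # Readable form, used by print() and str().
        return f"({self.x}, {self.y})"


p = Point(1, 2)
print(str(p))
print(repr(p))
```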

Introduction to Hadoop

Hadoop is a distributed system infrastructure developed by the Apache Software Foundation.

Users can develop distributed programs without knowing the underlying details of the distribution, taking full advantage of the cluster's power for high-speed computing and storage.

Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and designed to be deployed on low-cost hardware. It also provides high-throughput access to application data, which suits applications with large data sets. HDFS relaxes some POSIX requirements so that file system data can be accessed as a stream.

The core of Hadoop's design is HDFS and MapReduce. HDFS provides storage for massive amounts of data, and MapReduce provides computation over massive amounts of data.
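The MapReduce model itself can be sketched in memory, without Hadoop. The snippet below is a toy illustration of the map, shuffle, and reduce phases and does not use Hadoop's actual API. The helper map_reduce, the parse function, and the sample readings are all invented for the example.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run map, shuffle, and reduce phases over an in-memory list."""
    # Map: each record yields zero or more (key, value) pairs.
    pairs = [pair for record in records for pair in mapper(record)]

    # Shuffle: group values by key (Hadoop does this across the cluster).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # Reduce: collapse each key's values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


readings = ["2019,21", "2020,25", "2019,30", "2020,18"]


def parse(record):
    year, temp = record.split(",")
    yield year, int(temp)


hottest = map_reduce(readings, parse, lambda year, temps: max(temps))
print(hottest)
```

On a real cluster the records would be blocks of files in HDFS, and the map and reduce functions would run in parallel on many machines. The structure of the computation stays the same.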

Introduction to Java

Java is an object-oriented programming language. It absorbs many of the advantages of C++ while dropping concepts such as multiple inheritance and pointers that are hard to understand in C++. As a result, the Java language is both powerful and easy to use. As a representative of statically typed object-oriented languages, Java implements object-oriented theory very well, allowing programmers to approach complex programming in an elegant way.

Java is characterized by simplicity, object orientation, distribution, robustness, security, platform independence and portability, multi-threading, and dynamism. Java can be used to write desktop applications, web applications, distributed systems, and embedded system applications.

Introduction to Python

Python is an object-oriented, interpreted programming language created by the Dutch programmer Guido van Rossum in 1989; the first public release came out in 1991.

Python is free software. The source code and the CPython interpreter are released under the Python Software Foundation License, which is compatible with the GPL (GNU General Public License). Python's syntax is simple and clear, and one of its distinguishing features is that whitespace indentation is mandatory for delimiting statement blocks.

Fields where Python fits:

1. Web sites and various web services;

2. System tools and scripts;

3. As a "glue" language that wraps modules developed in other languages for easy use.

Python vs. other languages:

1. C is compiled into machine code, which runs very fast but requires a lot of code;

2. Java is compiled into bytecode, which runs fast but also requires a large amount of code;

3. Python is interpreted, which runs more slowly but requires less code.

Python basic syntax:

Unlike Java, Python does not use { } to delimit a block of code. Python is strict about indentation, and blocks are determined by their indentation level.

About variables:

1. Definition: there is no need to declare a type, but a variable must be assigned a value before it is used;

2. Scope of use: inside a class, adding two underscores before a name, such as __content = "haha", marks it as private (Python enforces this through name mangling); without the prefix, it is public by default.
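The double-underscore convention can be sketched as follows. Inside a class, Python implements it by name mangling, so it discourages outside access rather than strictly preventing it. The Note class is a hypothetical example.

```python
class Note:
    def __init__(self):
        self.title = "public"
        self.__content = "haha"   # stored on the object as _Note__content

    def reveal(self):
        return self.__content     # accessible from inside the class


note = Note()
print(note.reveal())
print(hasattr(note, "__content"))   # the unmangled name is not found
print(note._Note__content)          # the mangled name is still reachable
```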
