Install Airflow on Ubuntu 20.04 (Aliyun ECS)
Step 1: Check pip3 on Linux
root@SecondaryDataUse:~# pwd
/root
root@SecondaryDataUse:~# pip3 --version
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)
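If pip3 is missing, install it with apt first (a standard Ubuntu 20.04 package; shown as root here, so no sudo is needed):
root@SecondaryDataUse:~# apt update && apt install -y python3-pip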
Step 2: Install Airflow using pip3
root@SecondaryDataUse:~# export AIRFLOW_HOME=~/airflow
root@SecondaryDataUse:~# pip3 install apache-airflow
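Note: an unpinned install can pull in incompatible dependency versions. The Airflow docs recommend installing against a constraints file; a sketch, assuming Airflow 2.2.3 on Python 3.8 (substitute the versions you are actually targeting):
root@SecondaryDataUse:~# pip3 install "apache-airflow==2.2.3" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.3/constraints-3.8.txt"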
Step 3: Initialize the Airflow DB
root@SecondaryDataUse:~# airflow db init
DB: sqlite:////root/airflow/airflow.db
[2022-01-11 10:02:47,931] {db.py:921} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
WARNI [airflow.models.crypto] empty cryptography key - values will not be stored encrypted.
WARNI [unusual_prefix_105d5a3ee2d83a45d12269415995c8d67312bae5_example_kubernetes_executor] The example_kubernetes_executor example DAG requires the kubernetes provider. Please install it with: pip install apache-airflow[cncf.kubernetes]
Initialization done
root@SecondaryDataUse:~#
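The Airflow 2.x web UI requires a login, so create a user before starting the web server. The username, names, and email below are placeholders; the command prompts for a password:
root@SecondaryDataUse:~# airflow users create --username admin --firstname Admin --lastname User --role Admin --email admin@example.com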
Step 4: Start the web server:
root@SecondaryDataUse:~# airflow webserver -p 8081
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  /  _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2022-01-11 10:04:22,938] {dagbag.py:500} INFO - Filling up the DagBag from /dev/null
[2022-01-11 10:04:23,421] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create Admin
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8081
Timeout: 120
Logfiles: - -
Access Logformat:
=================================================================
[2022-01-11 10:04:25 +0800] [18764] [INFO] Starting gunicorn 20.1.0
[2022-01-11 10:04:25 +0800] [18764] [INFO] Listening at: http://0.0.0.0:8081 (18764)
[2022-01-11 10:04:25 +0800] [18764] [INFO] Using worker: sync
[2022-01-11 10:04:25 +0800] [18767] [INFO] Booting worker with pid: 18767
[2022-01-11 10:04:25 +0800] [18768] [INFO] Booting worker with pid: 18768
[2022-01-11 10:04:26 +0800] [18769] [INFO] Booting worker with pid: 18769
[2022-01-11 10:04:26 +0800] [18770] [INFO] Booting worker with pid: 18770
[2022-01-11 10:04:28,146] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create Admin
[2022-01-11 10:04:28,206] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create Admin
[2022-01-11 10:04:28,484] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create Admin
[2022-01-11 10:04:28,573] {manager.py:512} WARNING - Refused to delete permission view, assoc with role exists DAG Runs.can_create Admin
[2022-01-11 10:04:34 +0800] [18764] [INFO] Handling signal: winch
Note: Open port 8081 in the ECS instance's security group via the Aliyun web console; otherwise the UI will not be reachable from outside.
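The webserver above runs in the foreground and stops when the SSH session ends. To keep it running, it can be daemonized with -D (nohup or a systemd unit would also work):
root@SecondaryDataUse:~# airflow webserver -p 8081 -D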
Step 5: Start the Airflow scheduler
root@SecondaryDataUse:~# airflow scheduler
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  /  _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2022-01-11 10:08:27 +0800] [18872] [INFO] Starting gunicorn 20.1.0
[2022-01-11 10:08:27,667] {scheduler_job.py:596} INFO - Starting the scheduler
[2022-01-11 10:08:27,668] {scheduler_job.py:601} INFO - Processing each file at most -1 times
[2022-01-11 10:08:27 +0800] [18872] [INFO] Listening at: http://0.0.0.0:8793 (18872)
[2022-01-11 10:08:27 +0800] [18872] [INFO] Using worker: sync
[2022-01-11 10:08:27 +0800] [18873] [INFO] Booting worker with pid: 18873
[2022-01-11 10:08:27,675] {manager.py:163} INFO - Launched DagFileProcessorManager with pid: 18874
[2022-01-11 10:08:27,677] {scheduler_job.py:1114} INFO - Resetting orphaned tasks for active dag runs
[2022-01-11 10:08:27,685] {settings.py:52} INFO - Configured default timezone Timezone('UTC')
[2022-01-11 10:08:27,688] {scheduler_job.py:1137} INFO - Marked 1 SchedulerJob instances as failed
[2022-01-11 10:08:27,698] {manager.py:441} WARNING - Because we cannot use more than 1 thread (parsing_processes = 2 ) when using sqlite. So we set parallelism to 1.
[2022-01-11 10:08:27 +0800] [18876] [INFO] Booting worker with pid: 18876
[2022-01-11 10:08:34 +0800] [18872] [INFO] Handling signal: winch
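The parallelism WARNING above is expected: SQLite allows only a single writer, so Airflow caps DAG parsing at one process and restricts you to the SequentialExecutor; switch the metadata DB to MySQL or PostgreSQL for anything beyond testing. Like the webserver, the scheduler can also be run in the background:
root@SecondaryDataUse:~# airflow scheduler -D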
Step 6: Open the browser
The Airflow UI is at http://yourip:8081/ (replace yourip with the ECS public IP); log in with the user created earlier.
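Once logged in, a quick smoke test from the CLI is to unpause and trigger one of the bundled example DAGs (example_bash_operator ships with Airflow when load_examples is enabled, which is the default):
root@SecondaryDataUse:~# airflow dags unpause example_bash_operator
root@SecondaryDataUse:~# airflow dags trigger example_bash_operator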