v0.1.1 #16
Replies: 8 comments 13 replies
-
Hi. With the latest version,
-
Hi Emre,
Did you run docker-compose.init.yml after updating to the latest version?
New fields have been added to the Config table, so you'll need to delete
the contents of the storage folder, run the init scripts, and then run the
normal docker compose.
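The sequence described above might be scripted roughly as follows. This is a minimal sketch, not the project's official procedure: `COMPOSE`, `DRY_RUN`, `run`, `reset_and_init`, and `start_normal` are illustrative names, and tearing down the init stack with `down` before the normal start is an assumption. Only `docker-compose.init.yml` and the `storage` folder come from this thread.

```shell
#!/bin/sh
# Sketch only: COMPOSE, DRY_RUN, run, reset_and_init and start_normal are
# illustrative names, not part of rtdl.
COMPOSE="${COMPOSE:-docker-compose}"

run() {
  # Echo the command; execute it unless DRY_RUN=1.
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

reset_and_init() {
  run $COMPOSE down                               # stop the current stack
  run rm -rf ./storage                            # destructive: drops all local data
  run $COMPOSE -f docker-compose.init.yml up -d   # run the init services
}

start_normal() {
  # Assumption: call this only after the init containers have finished.
  run $COMPOSE -f docker-compose.init.yml down
  run $COMPOSE up -d
}
```

Setting `DRY_RUN=1` before calling `reset_and_init` prints the commands without executing them, which is a safe way to review the destructive `rm -rf` step first.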
…On Mon, 7 Mar 2022 at 22:02, Emre Baykal ***@***.***> wrote:
I did uncomment all Hadoop related containers and restarted services, but
it's the same. This time, it might be related to resourcemanager. Adding
logs below.
nodemanager | [75/100] resourcemanager:8088 is not available yet
nodemanager | [75/100] try in 5s once again ...
historyserver | [75/100] check for resourcemanager:8088...
historyserver | [75/100] resourcemanager:8088 is not available yet
historyserver | [75/100] try in 5s once again ...
How can I run statefun-functions without adding Hadoop containers? I am
still testing pii-detection functionality and I haven't been able to test
since I fetched & merged latest update. Thanks.
—
Reply to this email directly, view it on GitHub
<#16 (reply in thread)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AB7GDS37GGDY6S2ORALVD2DU6YVRLANCNFSM5QCGPFQA>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***
com>
-
Are you on a Unix/Linux variant? Could you please share your
docker-compose.yml? Apart from HDFS and statefun-functions, are there any
other services that have not come up?
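One way to answer the question about which services have not come up is to compare the services defined in the compose file with the ones currently running. The helper below is a hypothetical sketch (`not_running` is not part of rtdl), and the exact `ps` flags may differ between Compose versions.

```shell
#!/bin/sh
# Sketch only: not_running is an illustrative helper, not part of rtdl.
# Prints every name from the first list that is missing from the second.
not_running() {
  all="$1"      # newline-separated names of all services
  running="$2"  # newline-separated names of running services
  if [ -z "$running" ]; then
    printf '%s\n' "$all"
  else
    # -x: whole-line match, -F: fixed strings, -v: keep non-matching lines
    printf '%s\n' "$all" | grep -vxF -e "$running" || true
  fi
}

# Usage against a live stack (left commented out because it needs Docker):
# not_running "$(docker-compose config --services)" \
#             "$(docker-compose ps --services --filter status=running)"
```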
…On Mon, 7 Mar 2022 at 23:47, Emre Baykal ***@***.***> wrote:
Hi. Yes I did. Same result. All dependencies configured for
statefun-functions in docker-compose.yml are provided - rtdl-db, redpanda,
dremio services are healthy and running. Logs below (as I mentioned
earlier) might be interesting though. This goes forever:
nodemanager | [17/100] check for namenode:9000...
nodemanager | [17/100] namenode:9000 is not available yet
nodemanager | [17/100] try in 5s once again ...
resourcemanager | [17/100] check for namenode:9000...
resourcemanager | [17/100] namenode:9000 is not available yet
resourcemanager | [17/100] try in 5s once again ...
datanode | [17/100] check for namenode:9870...
datanode | [17/100] namenode:9870 is not available yet
datanode | [17/100] try in 5s once again ...
historyserver | [17/100] check for namenode:9000...
historyserver | [17/100] namenode:9000 is not available yet
historyserver | [17/100] try in 5s once again ...
-
Are you running Docker in sudo mode, or did you run setfacl?
Also, apart from HDFS and statefun-functions, are any other services not
coming up?
statefun-functions does not depend on HDFS, so it should come up either
way.
It takes about a minute to come up, though. Have you waited that long?
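Since statefun-functions does not depend on HDFS, one option is to start only the services it needs and leave the Hadoop containers out. The sketch below assumes the service names from the docker-compose.yml quoted in this thread; `SERVICES`, `DRY_RUN`, and `start_without_hadoop` are illustrative names.

```shell
#!/bin/sh
# Sketch only: SERVICES and start_without_hadoop are illustrative names.
# The service names are taken from the docker-compose.yml quoted in this thread.
COMPOSE="${COMPOSE:-docker-compose}"
SERVICES="rtdl-db redpanda dremio statefun-functions"

start_without_hadoop() {
  # Passing service names to `up` starts only those services and
  # anything they list under depends_on.
  echo "+ $COMPOSE up -d $SERVICES"
  [ "${DRY_RUN:-0}" = "1" ] || $COMPOSE up -d $SERVICES
}

# Then follow the logs of the one service under investigation:
#   docker-compose logs -f statefun-functions
```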
…On Tue, 8 Mar 2022 at 00:14, Emre Baykal ***@***.***> wrote:
I am running Docker on MacOS - M1 Pro arch (amd64). Previous rtdl version
was working without any issue. I temporarily removed pii services to see if
the system works properly.
docker-compose.yml
version: '3'
services:
  ##### Config Services - Start #####
  ### For YBDB - Start ###
  # rtdl-db:
  #   platform: linux/amd64
  #   image: yugabytedb/yugabyte:latest
  #   container_name: rtdl_rtdl-db
  #   volumes:
  #     - ./storage/rtdl-db_data:/home/yugabyte/yb_data
  #   command: ["bin/yugabyted", "start", "--base_dir=/home/yugabyte/yb_data", "--daemon=false"]
  #   expose:
  #     - 5433
  #   # ports:
  #   #   - 5433:5432
  #   healthcheck:
  #     test: ["CMD", "yb-ts-cli", "--server_address=localhost", "is_server_ready"]
  #     interval: 10s
  #     timeout: 5s
  #     retries: 12
  # config:
  #   # build:
  #   #   context: ./config
  #   platform: linux/amd64
  #   image: rtdl/rtdl-config:latest
  #   container_name: rtdl_config
  #   expose:
  #     - 80
  #   ports:
  #     - 80:80
  #   environment:
  #     RTDL_DB_PORT: 5433
  #   depends_on:
  #     rtdl-db:
  #       condition: service_healthy
  ### For YBDB - End ###
  ### For Postgres - Start ###
  rtdl-db:
    platform: linux/amd64
    image: postgres:latest
    container_name: rtdl_rtdl-db
    volumes:
      - ./storage/rtdl-db_data:/var/lib/postgresql/data
    expose:
      - 5432
    # ports:
    #   - 5433:5432
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U postgres" ]
      interval: 10s
      timeout: 5s
      retries: 12
  ### For Postgres - End ###
  config:
    # build:
    #   context: ./config
    platform: linux/amd64
    image: rtdl/rtdl-config:latest
    container_name: rtdl_config
    expose:
      - 80
    ports:
      - 80:80
    environment:
      RTDL_DB_HOST: rtdl-db
      RTDL_DB_PORT: 5432
      RTDL_DB_USER: rtdl
      RTDL_DB_PASSWORD: rtdl
      RTDL_DB_DBNAME: rtdl_db
    depends_on:
      rtdl-db:
        condition: service_healthy
  ##### Config Services - End #####
  ##### Ingest Service - Start #####
  ingest:
    # build:
    #   context: ./ingest
    platform: linux/amd64
    image: rtdl/rtdl-ingest:latest
    container_name: rtdl_ingest
    expose:
      - 8080
    ports:
      - 8080:8080
    environment:
      KAFKA_URL: redpanda:29092
      KAFKA_TOPIC: ingress
      LISTENER_PORT: 8080
    depends_on:
      - redpanda
  ##### Ingest Service - End #####
  ##### Kafka/Redpanda Services - Start #####
  redpanda:
    command:
      - redpanda
      - start
      - --smp
      - '1'
      - --reserve-memory
      - 0M
      - --overprovisioned
      - --node-id
      - '0'
      - --kafka-addr
      - PLAINTEXT://0.0.0.0:29092,OUTSIDE://0.0.0.0:9092
      - --advertise-kafka-addr
      - PLAINTEXT://redpanda:29092,OUTSIDE://0.0.0.0:9092
    # NOTE: Please use the latest version here!
    image: docker.vectorized.io/vectorized/redpanda:v21.9.5
    container_name: rtdl_redpanda
    user: root
    volumes:
      - ./storage/redpanda/data:/var/lib/redpanda/data
    expose:
      - 9092
      - 9644
      - 29092
    ports:
      - 9092:9092
      - 29092:29092
    healthcheck:
      test: [ "CMD","curl","-f","http://localhost:9644/v1/status/ready" ]
      start_period: 30s
      interval: 5s
      timeout: 2s
      retries: 24
  ##### Kafka/Redpanda Services - End #####
  ##### Processing Services - Start #####
  statefun-manager:
    platform: linux/amd64
    image: apache/flink-statefun:latest
    container_name: rtdl_process-statefun-manager
    expose:
      - 6123
      - 8081
    #ports:
    #  - 8081:8081
    environment:
      ROLE: master
      MASTER_HOST: statefun-manager
    volumes:
      - ./storage/rtdl-statefun-manager_store:/checkpoint-dir
      - ./ingester/module.yaml:/opt/statefun/modules/ingester/module.yaml
      # - ./pii-detection/module.yaml:/opt/statefun/modules/pii-detection/module.yaml
    depends_on:
      - ingest
  statefun-worker:
    platform: linux/amd64
    image: apache/flink-statefun:latest
    container_name: rtdl_process-statefun-worker
    expose:
      - 6121
      - 6122
    environment:
      ROLE: worker
      MASTER_HOST: statefun-manager
    volumes:
      - ./storage/rtdl-statefun-worker_store:/checkpoint-dir
      - ./ingester/module.yaml:/opt/statefun/modules/ingester/module.yaml
      # - ./pii-detection/module.yaml:/opt/statefun/modules/pii-detection/module.yaml
    depends_on:
      - statefun-manager
      - redpanda
  statefun-functions:
    # build:
    #   context: ./ingester
    platform: linux/amd64
    image: rtdl/process-statefun-functions:latest
    container_name: rtdl_process-statefun-functions
    expose:
      - 8082
    environment:
      RTDL_DB_HOST: rtdl-db
      RTDL_DB_PORT: 5432
      RTDL_DB_USER: rtdl
      RTDL_DB_PASSWORD: rtdl
      RTDL_DB_DBNAME: rtdl_db
      DREMIO_HOST: dremio
      DREMIO_PORT: 9047
      DREMIO_USERNAME: rtdl
      DREMIO_PASSWORD: rtdl1234
      DREMIO_CLOUD_PROJECT_ID: b2d480bb-1b54-424b-aba9-5294e1010b40
      DREMIO_MOUNT_PATH: /mnt/datastore
    volumes:
      - ./storage/rtdl-data_store:/app/datastore
    depends_on:
      rtdl-db:
        condition: service_healthy
      redpanda:
        condition: service_healthy
      dremio:
        condition: service_healthy
  ##### Processing Services - End #####
  ##### Dremio Services - Start #####
  dremio:
    platform: linux/amd64
    image: dremio/dremio-oss
    container_name: rtdl_dremio
    user: root
    volumes:
      - ./storage/dremio/data:/opt/dremio/data
      - ./storage/rtdl-data_store:/mnt/datastore
    expose:
      - 9047
      - 31010
      - 45678
    ports:
      - 9047:9047
      - 31010:31010
      - 45678:45678
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://localhost:9047/apiv2/server_status" ]
      interval: 10s
      timeout: 5s
      retries: 24
  ##### Dremio Services - End #####
  ##### Hadoop Services - Start #####
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    restart: always
    ports:
      - 9870:9870
      - 9000:9000
    volumes:
      - ./storage/hadoop/dfs/name:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop/hadoop.env
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode
    restart: always
    volumes:
      - ./storage/hadoop/dfs/data:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
    env_file:
      - ./hadoop/hadoop.env
  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864"
    env_file:
      - ./hadoop/hadoop.env
  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    env_file:
      - ./hadoop/hadoop.env
  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    volumes:
      - ./storage/hadoop/yarn/timeline:/hadoop/yarn/timeline
    env_file:
      - ./hadoop/hadoop.env
  ##### Hadoop Services - End #####
-
Could you please share a screenshot of the output after running the
following commands?
sudo docker-compose exec rtdl-db psql -U rtdl -d rtdl_db
Then:
select * from streams limit 1;
…On Tue, 8 Mar, 2022, 1:10 am Emre Baykal, ***@***.***> wrote:
- Yes it is in sudo mode.
- Other services are up and running without any issue. Only
statefun-functions has problems. Log says Failed to execute query...
so it is about ingester.go. But there are 4 different lines with same
error. All of them are related with db, though.
- I have been waiting for more than 15 minutes now. No response.
-
I did a clean setup on macOS (Intel) and it runs without issues. I'm not
using sudo.
Are you sure it was running on the M1 Mac previously? It always gives me
"qemu Segmentation fault" on an M1 Mac (2021). I'm not sure Docker is 100%
operational on ARM64.
Also, could you please try running without sudo?
To do that, please remove the directories under "storage" and re-run init,
because the directory permissions will have been messed up by sudo.
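Before deleting anything, it may help to confirm that ownership under "storage" really is the problem. The sketch below lists entries the current user does not own, which is what an earlier sudo run may have left behind; `foreign_owned` is a hypothetical helper, and ownership of bind-mounted directories can behave differently under Docker Desktop on macOS.

```shell
#!/bin/sh
# Sketch only: foreign_owned is an illustrative helper, not part of rtdl.
# Lists entries under the given directory that the current user does not own.
foreign_owned() {
  dir="${1:-./storage}"
  [ -d "$dir" ] || return 0            # nothing to check if the folder is absent
  find "$dir" ! -user "$(id -u)" -print
}
```

Calling `foreign_owned ./storage` and getting no output suggests the permissions are fine and the cause lies elsewhere.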
…On Tue, 8 Mar 2022 at 01:10, Emre Baykal ***@***.***> wrote:
- Yes it is in sudo mode.
- Other services are up and running without any issue. Only
statefun-functions has problems. Log says Failed to execute query...
so it is about ingester.go. But there are 4 different lines with same
error. All of them are related with db, though.
- I have been waiting for more than 15 minutes now. No response.
-
You need to run the init scripts first once you've cleaned storage.
Also, the logs you've shared show that you're hitting a qemu error as well.
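A quick way to confirm whether init has run is to ask Postgres if the "rtdl" role exists, connecting as the `postgres` user that the quoted docker-compose.yml sets through POSTGRES_USER. The sketch below only builds the command; `role_check_cmd` is a hypothetical helper, not part of rtdl.

```shell
#!/bin/sh
# Sketch only: role_check_cmd is an illustrative helper, not part of rtdl.
# It prints (does not run) a command that asks Postgres whether a role exists.
role_check_cmd() {
  role="${1:-rtdl}"
  # -t: tuples only, -A: unaligned output, -c: run a single command
  printf '%s\n' "docker-compose exec rtdl-db psql -U postgres -tAc \"SELECT 1 FROM pg_roles WHERE rolname = '$role'\""
}
```

Running the printed command (for example with `eval "$(role_check_cmd rtdl)"`) should print `1` if the role exists and nothing if it does not.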
…On Tue, 8 Mar, 2022, 3:51 pm Emre Baykal, ***@***.***> wrote:
This time I got the error below after rm -rf storage, removed all previous
images and started everything from scratch:
***@***.*** rtdl % sudo docker-compose exec rtdl-db psql -U rtdl -d rtdl_db
Password:
psql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: role "rtdl" does not exist
-
Excellent news!
…On Wed, 9 Mar 2022 at 00:06, Emre Baykal ***@***.***> wrote:
*Status Update*: Managed to run the project on self-hosted AWS. Deployed
*pii-detection* functionality. Currently testing. Thanks again!
-
V0.1.1 - Current status -- what works and what doesn't
What works? 🚀
rtdl's initial feature set is built and working. You can use the API on port 80 to
configure streams that ingest json from an rtdl endpoint on port 8080, process them into Parquet,
and save the files to a destination configured in your stream. rtdl can write files locally, to
AWS S3, GCP Cloud Storage, and Azure Blob Storage and you can query your data via Dremio's web UI
at http://localhost:9047 (login with Username: rtdl and Password: rtdl1234).
What's new? 💥
What doesn't work/what's next on the roadmap? 🚴🏼
This discussion was created from the release v0.1.1.