7 changes: 6 additions & 1 deletion .github/workflows/build.yml
@@ -4,13 +4,18 @@ on:
jobs:
build:
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- uses: actions/checkout@v2
- uses: ruby/setup-ruby@v1
with:
bundler-cache: true
- name: Build
run: |
git config user.name github-actions
git config user.email github-actions@github.com
ruby build.rb
bundle exec ruby build.rb
if ! git diff --exit-code
then
git add .
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,5 +1,7 @@
.DS_Store
.ipynb_ckeckpoints
.bundle/
vendor/

## Terraform
**/.terraform/*
6 changes: 6 additions & 0 deletions .rubocop.yml
@@ -0,0 +1,6 @@

Layout/HashAlignment:
Enabled: false

Layout/LeadingEmptyLines:
Enabled: false
2 changes: 1 addition & 1 deletion .ruby-version
@@ -1 +1 @@
3.1.2
4.0.5
5 changes: 5 additions & 0 deletions Gemfile
@@ -0,0 +1,5 @@
source "https://rubygems.org"

gem "liquid"
gem "base64"
gem "cgi"
29 changes: 29 additions & 0 deletions Gemfile.lock
@@ -0,0 +1,29 @@
GEM
remote: https://rubygems.org/
specs:
base64 (0.3.0)
bigdecimal (4.1.2)
cgi (0.5.1)
liquid (5.12.0)
bigdecimal
strscan (>= 3.1.1)
strscan (3.1.8)

PLATFORMS
arm64-darwin-24
ruby

DEPENDENCIES
base64
cgi
liquid

CHECKSUMS
base64 (0.3.0) sha256=27337aeabad6ffae05c265c450490628ef3ebd4b67be58257393227588f5a97b
bigdecimal (4.1.2) sha256=53d217666027eab4280346fba98e7d5b66baaae1b9c3c1c0ffe89d48188a3fbd
cgi (0.5.1) sha256=e93fcafc69b8a934fe1e6146121fa35430efa8b4a4047c4893764067036f18e9
liquid (5.12.0) sha256=5a3c2c2430cd925d21c53e4ed9abea52cd0a9da53b541422f81dee79aca2a673
strscan (3.1.8) sha256=aae2db611a225559f21ffbb71765c9a4e60fd262534a9ea84f4f11c7f32f679e

BUNDLED WITH
4.0.10
26 changes: 20 additions & 6 deletions LINUX.md
@@ -1,3 +1,4 @@

# Setup instructions

You will find below the instructions to set up your computer for [Le Wagon Data Engineering course](https://www.lewagon.com/)
@@ -325,7 +326,6 @@ The `gcloud` Command Line Interface (CLI) is used to communicate with Google Clo




Add the `APT` repository and install with:

```bash
@@ -345,6 +345,7 @@ gcloud --version




### Authenticate gcloud

We need to authenticate the `gcloud` CLI tool and set the project so it can interact with Google from the terminal.
@@ -370,6 +371,7 @@ We recommend allowing **Google Auth Library** to: _View and sign in to your Goog
For pasting into the terminal, you might need to use `ctrl + shift + v`



You also need to set the GCP project that you are working in. For this section, you'll need your **GCP Project ID**, which can be found on the GCP Console at this [link here 🔗](https://console.cloud.google.com). Make sure you copy the _Project ID_ and **not** the _Project number_.

To set your project, replace `<YOUR_PROJECT_ID>` with your GCP Project ID and run:
@@ -418,7 +420,6 @@ Terraform is a tool for [Infrastructure as Code (IaC) 🔗](https://en.wikipedia




Install some system requirements:
```bash
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common
@@ -450,6 +451,7 @@ sudo apt-get install terraform
```



Verify the installation with:

```bash
@@ -462,9 +464,13 @@ The output should look similar to:
Terraform v1.14.3
on <your_operating_system>_<your_cpu_architecture>




# Linux example
# Terraform v1.14.3
# on linux_amd64

```
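If you want to check the installed version in a script rather than by eye, the first line of the output can be parsed. The following is a minimal sketch, assuming the output keeps the `Terraform v<version>` shape shown above; `terraform_version_from_output` is a hypothetical helper name, not part of Terraform.

```bash
# Hypothetical helper (not part of Terraform): reads `terraform --version`
# style output on stdin and prints the bare version number from the first line.
# Assumes the first line looks like "Terraform v1.14.3".
terraform_version_from_output() {
  sed -n '1s/^Terraform v//p'
}

# Usage (assumes `terraform` is installed and on your PATH):
# terraform --version | terraform_version_from_output
```

The helper only strips the `Terraform v` prefix, so it prints nothing if the first line has a different shape.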


@@ -508,7 +514,6 @@ First we'll create a folder and download the terraform files with:




```bash
mkdir -p ~/code/wagon-de-bootcamp
```
@@ -521,14 +526,15 @@ curl -L -o ~/code/wagon-de-bootcamp/main.tf https://raw.githubusercontent.com/le
```


### Set variables

### Set variables



Open up the file `~/code/wagon-de-bootcamp/terraform.tfvars` in VS Code or any other code editor.



It should look like:

```bash
@@ -544,8 +550,10 @@ We'll need to change some values in this file. Here's were you can find the requ
- **region:** take a look at the GCP Region and Zone documentation at this [link here](https://cloud.google.com/compute/docs/regions-zones#available). We generally recommend you choose a geographically nearby region.
- **zone:** A zone is a subdivision of a region. Its name is almost always the **region** name with `-a`, `-b`, or `-c` appended. Which zone you select within a region should not have a functional impact.
- **instance_name:** we recommend naming your VM `lw-de-vm-<YOUR_GITHUB_USERNAME>`, replacing `<YOUR_GITHUB_USERNAME>` with your GitHub username.

- **instance_user:** in your terminal, run `whoami`, and enter the value

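Before saving the file, you can sanity-check the values you picked. The sketch below is only an illustration of the naming rules in the list above; `zone_matches_region` and `instance_name` are hypothetical helper names, and `europe-west1` is just an example region.

```bash
# Hypothetical helper: succeeds when the zone is the region name
# followed by "-" and a single lowercase letter (e.g. "-a", "-b", "-c").
zone_matches_region() {
  region="$1"
  zone="$2"
  case "$zone" in
    "$region"-[a-z]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints the recommended VM name for a GitHub username.
instance_name() {
  printf 'lw-de-vm-%s\n' "$1"
}

# Example values only; substitute your own region, zone, and username.
zone_matches_region "europe-west1" "europe-west1-b" && echo "zone OK"
instance_name "taylorswift"
```

The check is purely textual: it does not confirm that the zone actually exists in GCP, only that it follows the usual naming pattern.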

After completing this file, it might look similar to:

```bash
@@ -559,7 +567,9 @@ instance_user = "taylorswift"
Save the `terraform.tfvars` file, then use your terminal to navigate into the directory containing the Terraform files:

```bash

cd ~/code/wagon-de-bootcamp

```

Initialise and test the terraform config files with:
@@ -624,6 +634,7 @@ For example, try running:
```



### Connect with VS Code

To connect to your Virtual Machine, click on the small symbol at the very bottom-left corner of VS Code:
@@ -701,6 +712,7 @@ We recommend allowing **Google Auth Library** to: _View and sign in to your Goog
For pasting into the terminal, you might need to use `ctrl + shift + v`



You also need to set the GCP project that you are working in. For this section, you'll need your **GCP Project ID**, which can be found on the GCP Console at this [link here 🔗](https://console.cloud.google.com). Make sure you copy the _Project ID_ and **not** the _Project number_.

To set your project, replace `<YOUR_PROJECT_ID>` with your GCP Project ID and run:
@@ -823,7 +835,7 @@ We will use the GitHub CLI (`gh`) to connect to GitHub using *SSH*, a protocol t

First, in order to **log in**, copy-paste the following command into your terminal:

:warning: **DO NOT edit the `email`**
:warning: **DO NOT edit the `email`** — Even though `user:email` looks like a placeholder for your actual email address, it isn't — do not replace it.

```bash
gh auth login -s 'user:email' -w --git-protocol ssh
@@ -835,7 +847,9 @@ gh auth login -s 'user:email' -w --git-protocol ssh

If you already have SSH keys, you will see instead `Upload your SSH public key to your GitHub account?` With the arrows, select your public key file path and press `Enter`.

- `Enter a passphrase for your new SSH key (Optional)`. Type something you want and that you'll remember. It's a password to protect your private key stored on your hard drive. Then press `Enter`.
- `Enter a passphrase for your new SSH key (Optional)`:
- **FOR MOST PEOPLE:** Just press `Enter` to skip. You don't need a passphrase for the bootcamp and it would prompt you every time you use the key. There is a risk, however, that if someone steals your laptop, they could then push to GitHub.
- **IF SECURITY IS REALLY IMPORTANT TO YOU:** Enter a passphrase of your choice and press `Enter`. It's _really_ important that if you enter a passphrase, you write it down somewhere immediately and do not lose/forget it. You will need to enter this frequently.

- `Title for your SSH key`. You can leave it at the proposed "GitHub CLI", press `Enter`.

17 changes: 12 additions & 5 deletions LINUX_keep_current.md
@@ -86,10 +86,12 @@ type -a pyenv > /dev/null && eval "$(pyenv init --path)"

Update pyenv :


``` bash
cd $(pyenv root) && git pull
```


Install the current python version :

```bash
@@ -140,10 +142,12 @@ pyenv versions
pip install -U pip
```


``` bash
pip install -r https://raw.githubusercontent.com/lewagon/data-setup/master/specs/releases/linux.txt
```


## GCP

Make sure that the `gcloud` command is linked to the email address of your Google Cloud Platform account :
@@ -212,8 +216,7 @@ cat $GOOGLE_APPLICATION_CREDENTIALS
{
"type": "service_account",
"project_id": "your-gcp-project-id",
"private_key_id": "a2d4a2d4a2d4a2d4a2d4a2d4a2d4a2d4a2d4a2d4",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInMIInM=\n-----END PRIVATE KEY-----\n",
"private_key_id": "...",
"client_email": "your-service-account@your-service-account.iam.gserviceaccount.com",
"client_id": "105410541054105410541",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
@@ -252,12 +255,14 @@ gcloud auth configure-docker

## Docker


Start Docker :

``` bash
sudo service docker start
```


Verify that Docker can run the hello-world image :

``` bash
@@ -266,28 +271,30 @@ docker run hello-world

👉 Make sure that this command completes correctly


Stop Docker :

``` bash
sudo service docker stop
```



### Python setup check up

Check your Python version with the following commands:
```bash
zsh -c "$(curl -fsSL <PYTHON_CHECKER_URL>)" 3.8.12
zsh -c "$(curl -fsSL https://raw.githubusercontent.com/lewagon/data-setup/master/checks/python_checker.sh)" 3.8.12
```

Run the following command to check if you successfully installed the required packages:
```bash
zsh -c "$(curl -fsSL <PIP_CHECKER_URL>)"
zsh -c "$(curl -fsSL https://raw.githubusercontent.com/lewagon/data-setup/master/checks/pip_check.sh)"
```

Now run the following command to check if you can load these packages:
```bash
python -c "$(curl -fsSL <PIP_LOADER_URL>)"
python -c "$(curl -fsSL https://raw.githubusercontent.com/lewagon/data-setup/master/checks/pip_check.py)"
```
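If you prefer a quick local comparison before running the remote checker scripts, the version string can be compared directly. This is a minimal sketch, not the course's checker; `python_version_matches` is a hypothetical helper, and it assumes `python --version` prints `Python <version>`.

```bash
# Hypothetical helper (not the course's checker script): succeeds when the
# given `python --version` style output matches the expected version exactly.
python_version_matches() {
  expected="$1"
  actual_output="$2"
  [ "$actual_output" = "Python $expected" ]
}

# Usage (assumes `python` is on your PATH):
# python_version_matches "3.8.12" "$(python --version 2>&1)" && echo "Python version OK"
```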

Make sure you can run Jupyter:
18 changes: 16 additions & 2 deletions WINDOWS.md
@@ -1,3 +1,4 @@

# Setup instructions

You will find below the instructions to set up your computer for [Le Wagon Data Engineering course](https://www.lewagon.com/)
@@ -397,9 +398,13 @@ The output should look similar to:
Terraform v1.14.3
on <your_operating_system>_<your_cpu_architecture>



# Windows example
# Terraform v1.14.3
# on windows_amd64


```


@@ -483,8 +488,10 @@ We'll need to change some values in this file. Here's were you can find the requ
- **region:** take a look at the GCP Region and Zone documentation at this [link here](https://cloud.google.com/compute/docs/regions-zones#available). We generally recommend you choose a geographically nearby region.
- **zone:** A zone is a subdivision of a region. Its name is almost always the **region** name with `-a`, `-b`, or `-c` appended. Which zone you select within a region should not have a functional impact.
- **instance_name:** we recommend naming your VM `lw-de-vm-<YOUR_GITHUB_USERNAME>`, replacing `<YOUR_GITHUB_USERNAME>` with your GitHub username.

- **instance_user:** in Command Prompt, run `echo %username%`, and enter the value. Try to remember your username; you will need it later on


After completing this file, it might look similar to:

```bash
@@ -498,7 +505,9 @@ instance_user = "taylorswift"
Save the `terraform.tfvars` file, then use your terminal to navigate into the directory containing the Terraform files:

```bash

cd %USERPROFILE%\wagon-de-bootcamp

```

Initialise and test the terraform config files with:
@@ -562,6 +571,7 @@ For example, try running:
# $ ssh lw-de-vm-<GITHUB_USERNAME>.<GCP_ZONE>.<GCP_PROJECT_ID>
```


### Confirm Your SSH Settings

Let's take a look at the SSH configuration that was just created and verify it. In VS Code:
@@ -634,6 +644,7 @@ icacls %USERPROFILE%\.ssh\google_compute_engine /grant:r SYSTEM:(R) && ^
icacls %USERPROFILE%\.ssh\google_compute_engine
```


### Connect with VS Code

To connect to your Virtual Machine, click on the small symbol at the very bottom-left corner of VS Code:
@@ -707,6 +718,7 @@ It's is usually the first check box.
We recommend allowing **Google Auth Library** to: _View and sign in to your Google Cloud SQL instances._



For pasting into the terminal, you might need to use `ctrl + shift + v`


@@ -833,7 +845,7 @@ We will use the GitHub CLI (`gh`) to connect to GitHub using *SSH*, a protocol t

First, in order to **log in**, copy-paste the following command into your terminal:

:warning: **DO NOT edit the `email`**
:warning: **DO NOT edit the `email`** — Even though `user:email` looks like a placeholder for your actual email address, it isn't — do not replace it.

```bash
gh auth login -s 'user:email' -w --git-protocol ssh
@@ -845,7 +857,9 @@ gh auth login -s 'user:email' -w --git-protocol ssh

If you already have SSH keys, you will see instead `Upload your SSH public key to your GitHub account?` With the arrows, select your public key file path and press `Enter`.

- `Enter a passphrase for your new SSH key (Optional)`. Type something you want and that you'll remember. It's a password to protect your private key stored on your hard drive. Then press `Enter`.
- `Enter a passphrase for your new SSH key (Optional)`:
- **FOR MOST PEOPLE:** Just press `Enter` to skip. You don't need a passphrase for the bootcamp and it would prompt you every time you use the key. There is a risk, however, that if someone steals your laptop, they could then push to GitHub.
- **IF SECURITY IS REALLY IMPORTANT TO YOU:** Enter a passphrase of your choice and press `Enter`. It's _really_ important that if you enter a passphrase, you write it down somewhere immediately and do not lose/forget it. You will need to enter this frequently.

- `Title for your SSH key`. You can leave it at the proposed "GitHub CLI", press `Enter`.
