Monday, December 19, 2016

OpenStack Day 2 - Deploying the Undercloud

Finally ready to start pushing some buttons.

I took a final read through the TripleO documentation (http://tripleo.org/index.html) to make sure I understood the pre-deployment activities, and made sure to read all the way through looking for any last-minute gotchas ("But first remove the fuse!", M*A*S*H Season 1, Episode 20). I'd recommend those of you following along at home do likewise. Or, if you're pressed for time, at least read through the architecture section (http://tripleo.org/introduction/architecture.html) so the terminology makes sense.

TripleO can deploy into virtual environments (using VMs as targets). I'm targeting physical machines and will stick to those sections of the install document.

After reading the docs I made the following hardware cutoff for sorting through my stack of unused hardware:

  • Multi-core CPU
  • 4 GB memory
  • 60 GB free disk space
    • Machines running ceilometer need a separate partition for MongoDB so that it can't fill up the root partition
  • 2 Gigabit NICs
  • Overcloud machines:
    • All Overcloud machines must support IPMI. Ironic supports a number of drivers: pxe_ipmitool (IPMI), pxe_ilo (HP ProLiant Gen 8 and Gen 9), pxe_drac (Dell 11G and newer), fake_pxe (a stub that won't do power management), and fake (won't do anything; Ironic won't touch the hardware). Others are listed at http://docs.openstack.org/developer/ironic/#driver-references
    • The NIC used for PXE boot should be the only one enabled for network booting, and it should be at the top of the boot order (ahead of local disk and CD)
  • Ceph machines should have 1 GB of RAM per 1 TB of storage, and it's recommended that the OS be on a separate drive from the data for performance. So consider 2 drives a minimum.
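The basic cutoff above can be written down as a quick check for sorting through a pile of machines. This is only a sketch: the function name meets_cutoff is made up for illustration, "multi-core" is assumed to mean at least 2 cores, and the numbers are passed in by hand rather than probed from the hardware.

```shell
#!/bin/sh
# Hypothetical helper: returns success if a machine meets the basic cutoff
# from the list above (multi-core CPU, 4 GB memory, 60 GB disk, 2 NICs).
# Arguments: cores, memory in GB, free disk in GB, number of Gigabit NICs.
meets_cutoff() {
    cores=$1
    mem_gb=$2
    disk_gb=$3
    nics=$4
    [ "$cores" -ge 2 ] && [ "$mem_gb" -ge 4 ] &&
        [ "$disk_gb" -ge 60 ] && [ "$nics" -ge 2 ]
}

# Example: a 4-core box with 8 GB RAM, 250 GB disk and 2 NICs.
if meets_cutoff 4 8 250 2; then
    echo "candidate"
else
    echo "reject"
fi
```

On a running Linux box the inputs could come from tools like nproc, free -g and df, but the IPMI and PXE boot-order requirements still have to be checked by hand in the BIOS.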

A number of other questions come up during deployment and so I made some initial decisions:

  • We will not deploy with SSL initially
  • We will not deploy network isolation (the ability to create isolated networks within OpenStack). Network isolation is cool, but it also increases complexity for not much gain in my use case. Start simple and then complicate as needed.
  • We will be deploying with Ceph. Arguably a complication but I really like everything I've read about Ceph.
    • How many nodes????
  • We will be deploying with Swift (so glance can store images in swift rather than directly on the controller)
  • We will be deploying only a single controller. This is a single point of failure; however, a High Availability solution requires 3 controllers and we haven't identified enough hardware. This is one of the first areas that should be addressed during scale-out.
  • Magnum ... I really wanted to use Magnum as well (for containers) but again it's not needed for my initial use case so... later.

I plan to deploy: One Undercloud Director; One Overcloud Controller; At least one compute node but really as many as I can get (we'll know after hardware introspection); Three (3) Ceph nodes (1 monitor, 2 OSD nodes); 2 Networks (the PXE/Management network and the External/Public network). After that we'll discover how easy it is to scale... or not.

Let's get to it.

I downloaded a copy of CentOS 7 (Minimal - 1511 release). The 1611 release has just occurred but I don't want anything too shiny and new because it introduces variables. 1511 was the available release when the TripleO documentation was updated for the OpenStack Newton release so I'm going that direction.

After booting up the install media on my chosen Undercloud Director hardware (Dell R210) I went through the install quickly. I was rather surprised to find that the minimal install required a mouse but I had a USB mouse on hand and just plugged it in. I think I could have gone through the Boot with Troubleshooting option and bypassed the mouse requirement but it looked even more painful.

Since this is the director node I went with a standard LVM disk layout with a couple of changes:

  1. /home seems to get the bulk of the space, and since we won't be logging in with users I reduced this down to 100G. This may end up being a bad move because we'll be storing images in /home/stack... we'll see.
  2. Bumped the / partition up to 500G
  3. Added a /mongo partition for later use with ceilometer and let it claim the remaining space (273G).

For the network setup I left Gigabit 1 (em1) off. I enabled em2 and let it DHCP.

I assigned a root password and created a local user account for myself (with Admin privileges).

When the installation completed I let the system reboot. Once it had rebooted I used putty to SSH into it using my local user account. So far so good.

Here's where I hit my first problem. Some of the software that TripleO wants to install conflicts with a package already installed by the minimal installer. Argh. So I wiped and started over again, repeating the above configuration steps and bringing myself back to this point, where we can repeat with the benefit of our future knowledge.

Here are the steps I took to deploy my Undercloud. Substitute your hostname for 'myhost.mydomain', and when editing /etc/hosts (I use 'vi'; use whatever you want) make sure to add that hostname to both the 127.0.0.1 and ::1 lines. The 'sudo yum erase -y mariadb-libs' line clears up the package conflict mentioned above.

sudo useradd stack
sudo passwd stack  # specify a password
echo "stack ALL=(root) NOPASSWD:ALL" | sudo tee -a /etc/sudoers.d/stack
sudo chmod 0440 /etc/sudoers.d/stack

sudo hostnamectl set-hostname myhost.mydomain
sudo hostnamectl set-hostname --transient myhost.mydomain
sudo vi /etc/hosts
sudo yum erase -y mariadb-libs
sudo curl -L -o /etc/yum.repos.d/delorean-newton.repo \
https://trunk.rdoproject.org/centos7-newton/current/delorean.repo
sudo curl -L -o /etc/yum.repos.d/delorean-deps-newton.repo \
http://trunk.rdoproject.org/centos7-newton/delorean-deps.repo
sudo yum -y install --enablerepo=extras centos-release-ceph-jewel
sudo sed -i -e 's%gpgcheck=.*%gpgcheck=0%' /etc/yum.repos.d/CentOS-Ceph-Jewel.repo
sudo yum -y install yum-plugin-priorities
sudo yum install -y python-tripleoclient
sudo yum install -y yajl
sudo su -l stack
cp /usr/share/instack-undercloud/undercloud.conf.sample ~/undercloud.conf
vi undercloud.conf
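For anyone who would rather not hand-edit /etc/hosts, the edit described above could be scripted roughly like this. It's a sketch, not part of the TripleO instructions: the function name add_host_to_loopback is made up, it assumes GNU sed (as on CentOS), and it should be tried on a copy of the file first.

```shell
#!/bin/sh
# Hypothetical helper: append the FQDN and its short name to the
# 127.0.0.1 and ::1 lines of a hosts-style file.
# Arguments: path to the hosts file, fully qualified hostname.
add_host_to_loopback() {
    hosts_file=$1
    fqdn=$2
    short=${fqdn%%.*}    # everything before the first dot

    # Do nothing if the hostname is already in the file.
    if grep -qwF -- "$fqdn" "$hosts_file"; then
        return 0
    fi

    # Append to the end of each loopback line; other lines are untouched.
    sed -i \
        -e "/^127\.0\.0\.1[[:space:]]/ s/\$/ $fqdn $short/" \
        -e "/^::1[[:space:]]/ s/\$/ $fqdn $short/" \
        "$hosts_file"
}
```

Something like cp /etc/hosts /tmp/hosts.test followed by add_host_to_loopback /tmp/hosts.test myhost.mydomain shows the result without touching the real file; applying it to /etc/hosts itself would need root.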



I have reserved 192.168.50.0/24 for use with the Undercloud PXE/Management network. Tentatively making the following changes to undercloud.conf (it's possible I could have deployed with just a change to local_interface but that would have used a weird default IP Block).


local_ip = 192.168.50.1/24
local_interface = em1
network_gateway = 192.168.50.1
network_cidr = 192.168.50.0/24
masquerade_network = 192.168.50.0/24
dhcp_start = 192.168.50.100
dhcp_end = 192.168.50.250
inspection_iprange = 192.168.50.80,192.168.50.99


I think this says:
  • The director is 192.168.50.1/24 on em1
  • The assigned network gateway for the management network should be .1
  • The whole network is 192.168.50.0/24
  • The masquerade Network is the same
  • The DHCP range is 192.168.50.100 - 192.168.50.250
  • The Introspection IP range is 192.168.50.80 - 192.168.50.99
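A quick sanity check on that reading is to confirm that every range endpoint sits inside the /24 and that the DHCP and introspection ranges don't step on each other. The helper names below (ip_to_int, in_cidr, ranges_overlap) are my own for illustration, not anything TripleO ships; the addresses are the ones from my undercloud.conf.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Succeeds if address $1 falls inside network $2 (written as a.b.c.d/bits).
in_cidr() {
    ip_n=$(ip_to_int "$1")
    net_n=$(ip_to_int "${2%/*}")
    shift_by=$(( 32 - ${2#*/} ))
    [ $(( ip_n >> shift_by )) -eq $(( net_n >> shift_by )) ]
}

# Succeeds if the range $1..$2 overlaps the range $3..$4.
ranges_overlap() {
    a1=$(ip_to_int "$1"); a2=$(ip_to_int "$2")
    b1=$(ip_to_int "$3"); b2=$(ip_to_int "$4")
    [ "$a1" -le "$b2" ] && [ "$b1" -le "$a2" ]
}

cidr=192.168.50.0/24
for addr in 192.168.50.100 192.168.50.250 192.168.50.80 192.168.50.99; do
    in_cidr "$addr" "$cidr" || echo "outside $cidr: $addr"
done

if ranges_overlap 192.168.50.100 192.168.50.250 192.168.50.80 192.168.50.99; then
    echo "dhcp and inspection ranges overlap"
else
    echo "dhcp and inspection ranges are disjoint"
fi
```

This only checks the arithmetic. Whether the installer is happy with the values is a separate question that only running it will answer.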

I'm interpreting this as leaving 192.168.50.2 - 192.168.50.79 free for the IPs I assign to my IPMI interfaces.

Or so I hope. It might be a while in the future before I discover the error of my ways here...

Make the changes and save ~stack/undercloud.conf

Finally the moment we've been waiting for today:

$ openstack undercloud install

You can make changes and re-run this command for as long as you haven't deployed an Overcloud. After that, it's a bad idea.

If all goes well you should receive a happy message and a couple of files will get created.


#############################################################################
Undercloud install complete.
The file containing this installation's passwords is at
/home/stack/undercloud-passwords.conf.
There is also a stackrc file at /home/stack/stackrc.
These files are needed to interact with the OpenStack services, and should be
secured.
#############################################################################
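The installer says these files "should be secured" without saying how. One simple reading is to make them readable and writable by the owning user only. The sketch below takes that interpretation; the function name secure_undercloud_files is made up, and owner-only permissions are my assumption, not something the TripleO docs prescribe.

```shell
#!/bin/sh
# Hypothetical helper: restrict the two files named in the installer output
# to owner read/write only. Argument: the directory holding them.
secure_undercloud_files() {
    dir=$1
    for f in "$dir/undercloud-passwords.conf" "$dir/stackrc"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        chmod 600 "$f"
    done
}
```

Run as the stack user, that would be secure_undercloud_files /home/stack, followed by ls -l on both files to confirm the result.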

At this point our undercloud should be usable, and we can run a few tests to find out. I should note that I received a LOT of warnings, mainly about the use of deprecated commands. My early guess is that some of the scripting (probably in Puppet) hasn't been updated for Newton (the latest version of OpenStack at this time) and that it's not a high priority since these are only warnings. Again... we'll see.

[stack@ostack-director ~]$ source stackrc
[stack@ostack-director ~]$ openstack network list


+--------------------------------------+----------+--------------------------------------+
| ID                                   | Name     | Subnets                              |
+--------------------------------------+----------+--------------------------------------+
| 5465303c-736e-4c3b-926a-4f791c265b3e | ctlplane | ae23dfb5-cb87-4475-adc9-ec82062db8af |
+--------------------------------------+----------+--------------------------------------+


[stack@ostack-director ~]$ openstack compute service list

+----+----------------+-----------------+----------+---------+-------+----------------------------+
| ID | Binary         | Host            | Zone     | Status  | State | Updated At                 |
+----+----------------+-----------------+----------+---------+-------+----------------------------+
|  1 | nova-cert      | ostack-director | internal | enabled | up    | 2016-12-19T20:50:55.000000 |
|  2 | nova-scheduler | ostack-director | internal | enabled | up    | 2016-12-19T20:51:00.000000 |
|  3 | nova-conductor | ostack-director | internal | enabled | up    | 2016-12-19T20:50:56.000000 |
|  4 | nova-compute   | ostack-director | nova     | enabled | up    | 2016-12-19T20:50:55.000000 |
+----+----------------+-----------------+----------+---------+-------+----------------------------+
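Eyeballing the State column works for four services, but the same check can be scripted against the table output. This is a rough sketch that assumes the column layout shown above (State is the sixth column); the function name services_not_up is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read an "openstack compute service list" table on
# stdin and print the Binary name of every service whose State is not "up".
services_not_up() {
    awk -F'|' '
        # Data rows split into 9 fields on "|"; border lines have only 1.
        # The header row is skipped by matching "ID" in the first column.
        NF >= 8 && $2 !~ /ID/ {
            binary = $3; gsub(/[ \t]/, "", binary)
            state = $7;  gsub(/[ \t]/, "", state)
            if (state != "up") print binary
        }'
}
```

Piping the command's output through it (openstack compute service list | services_not_up) should print nothing when everything is healthy, as it is in the table above.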

$ ifconfig -a
br-ctlplane: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.50.1  netmask 255.255.255.0  broadcast 192.168.50.255


$ cat /etc/sysconfig/network-scripts/ifcfg-em1
# This file is autogenerated by os-net-config
DEVICE=em1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ctlplane
BOOTPROTO=none
MTU=1500


So there's something in there... DEVICETYPE=ovs. Yum. Lots of discussion on that coming.

Next up: Getting the other hardware registered with Ironic.

1 comment:

  1. Hi Steve,

    Can you share a diagram of the environment? I'm a little bit confused about the provisioning and external networks.

    Regards,
    Melborn
