Up to this point things have actually gone pretty smoothly. This isn't my first foray into deploying a bit of OpenStack. My first attempt used an Ubuntu/Canonical approach built around a bit of Canonical magic called MAAS: Metal As A Service (https://maas.io). It's a pretty slick idea and I was excited to try it. In some ways it seemed like a solution in search of a problem, but I'm geeky enough to want to do it for its own sake.
I failed. My understanding of IPMI (as well as PXE, BMC, etc.) was lacking and I didn't want to take the time to fill that gap just to be able to use MAAS. It wasn't magic enough. In fairness, it probably would be if you were using hardware new enough to actually support IPMI properly.
So IPMI is at the root of it all. What's IPMI? IPMI stands for the Intelligent Platform Management Interface. The specification was led by Intel and first published way back in 1998 (that date is important). It defines an autonomous subsystem that provides management and monitoring capabilities for the host's CPU, firmware, and OS. It lets administrators manage the system out of band using a connection directly to the hardware. In the early days this was a dedicated port, but later most vendors added the capability to one of the on-board Network Interface Cards (NICs), so IPMI shares the port with regular Ethernet traffic (so-called side-band rather than out-of-band).
One of the things that IPMI can manage and monitor is power. To do this the host system has an IPMI subsystem called the baseboard management controller (BMC), which is generally powered up and running even if the system is "off". The BMC is the central processor for all IPMI interactions with the various host components. So a rack full of servers controlled by IPMI can be powered up and down as needed, which allows for power savings and potentially longer-lived systems.
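As a concrete example (we'll meet the ipmitool utility properly a bit further down), remote power control against a reachable BMC looks something like this; the address and credentials are just placeholders:
$ ipmitool -I lanplus -H <bmc-ip> -U <user> -P <password> chassis power status
$ ipmitool -I lanplus -H <bmc-ip> -U <user> -P <password> chassis power on
$ ipmitool -I lanplus -H <bmc-ip> -U <user> -P <password> chassis power off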
IPMI is only really available on server-class hardware. Commodity, home-use, or desktop hardware really has little need for this type of technology. Vendors have released various implementations such as the Dell DRAC, HP Integrated Lights-Out (iLO), the IBM Remote Supervisor Adapter, and so on.
Like most things technology related, IPMI has multiple versions. v1.0, as mentioned, was published in 1998. v1.5 was released in 2001 and added IPMI over LAN. v2.0 arrived in 2004 and added VLAN support and some security features. v2.0 rev 1.1 was published in 2014 and added support for IPv6. Finally, v2.0 rev 1.1 errata 7 was published in 2015 with clarifications and a couple of more secure protocols.
And this versioning is really the rub... or rather the vendor support, or lack thereof. The beauty of standards is that there are so many to choose from, and often they contain optional pieces or fuzzy areas where vendors just make a call. My pile of hardware is largely composed of older Dell servers ranging from Gen 9s (850, 1950, 2950) up to a couple of Gen 11s (R210, R310, R410 and R515). I'm also using a couple of HP ProLiant DL320e-Gen8 servers. Digging through my pile of unused hardware yielded 16 possible machines. More than enough. The specs on some were not great (particularly in the RAM department, and many also only have a single 250 GB drive), but my thinking was that if the machines are usable, a few replacement components will be cheap compared to entire systems.
When I was using MAAS it completely failed to recognize anything older than the Gen 11 Dell servers. At the time I only had two of those, so rolling out my five-node proof-of-concept cloud wasn't going to happen. So I ditched MAAS and hand-rolled some OpenStack. Brutal.
This time, however, we're using OpenStack Bare Metal (Ironic) to manage the physical nodes. Since my goal is to learn the ins and outs of OpenStack, figuring out how Ironic works is most certainly on the agenda, in a way that learning MAAS really wasn't.
With my mind firmly on the results of the MAAS experiment, I chose to start with a Dell PE2950. This is a Generation 9 machine circa 2006 (keep in mind IPMI v2.0 was published in 2004). If this node works then probably everything is going to work. If it's even a little bit shaky then it'll be the hardest to get working and everything else will be easier afterward. Right?
On with the show.
We have our undercloud director up and running. It has an externally accessible IP on interface 2 and an IP address of 192.168.50.1/24 on its first interface. We deliberately chose the first interface (eth0, gig0, g0, whatever) as the Management/PXE interface because a lot (all?) of the Dell DRAC implementations only support IPMI on the first interface.
So I add a management switch to the mix and plug in the first interface of the undercloud director and the first interface of my first node: the Dell PE2950.
To get IPMI working we really only need three pieces of information: the IP address of the node's BMC and the login credentials (username/password). Boot up the 2950 and go into Setup. In Setup we want to ensure that virtualization is enabled on the CPU, that we will be PXE booting off the first interface, and that PXE is first in the boot order. Next we go through the boot sequence again and press Ctrl-E when the BMC/IPMI setup section appears. I define a static IP address of 192.168.50.7, netmask 255.255.255.0 and gateway of 192.168.50.1. I set the username to 'root' (because it was the default and why confuse it) and the password to 'OpenStack' (to be changed later, but we aren't even sure if this is going to work yet). Save everything and power off the 2950.
Note: You should recall from Day 2 (http://www.thefullstacky.com/2016/12/openstack-day-2-deploying-undercloud.html) that I'm under the impression I can use 192.168.50.2 - 192.168.50.79 for IPMI. The .7 choice was a bit random and I like 7s.
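Incidentally, if a node is already booted into Linux with the OpenIPMI kernel drivers loaded, the same BMC settings can usually be applied in-band with ipmitool instead of the Ctrl-E screen. A rough sketch (LAN channel 1 and user ID 2 are typical for these Dell BMCs, but your hardware may differ):
$ sudo modprobe ipmi_si
$ sudo modprobe ipmi_devintf
$ sudo ipmitool lan set 1 ipsrc static
$ sudo ipmitool lan set 1 ipaddr 192.168.50.7
$ sudo ipmitool lan set 1 netmask 255.255.255.0
$ sudo ipmitool lan set 1 defgw ipaddr 192.168.50.1
$ sudo ipmitool user set password 2 OpenStack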
SSH into the undercloud director. From the command prompt we should be able to ping the BMC of the new node because the DRAC/BMC should be powered up even if the host is "off".
$ ping 192.168.50.7
PING 192.168.50.7 (192.168.50.7) 56(84) bytes of data.
64 bytes from 192.168.50.7: icmp_seq=1 ttl=64 time=0.928 ms
64 bytes from 192.168.50.7: icmp_seq=2 ttl=64 time=0.489 ms
64 bytes from 192.168.50.7: icmp_seq=3 ttl=64 time=0.440 ms
^C
--- 192.168.50.7 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.440/0.619/0.928/0.219 ms
So far so good.
To register the nodes with Ironic we'll create a JSON file that has the required information in it. We know the IP, username and password, so we just need to provide a pm_type. The standard type is 'pxe_ipmitool', which uses the ipmitool utility. The TripleO docs recommend that driver for just about everything. There is also 'pxe_ilo', which is recommended for HP Gen 8/9 machines, and 'pxe_drac', which they recommend for Gen 11 and newer Dell systems. The PE2950 is Dell Gen 9 so we'll go with pxe_ipmitool. Our final JSON file looks like this (the 'name' tag is optional; it just makes me happy, and it's easier to relate to the hardware by name than by the IP address or the giant UUID).
$ cat instackenv.json
{
  "nodes": [
    {
      "pm_type": "pxe_ipmitool",
      "pm_addr": "192.168.50.7",
      "pm_user": "root",
      "pm_password": "OpenStack",
      "name": "TAG-203"
    }
  ]
}
Run a JSON validator to make sure we haven't made any silly errors:
$ json_verify < instackenv.json
JSON is valid
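If you don't have json_verify handy, Python's built-in json.tool module does the same basic sanity check:
$ python -m json.tool < instackenv.json > /dev/null && echo "JSON is valid"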
Cross fingers and toes and run the import. First we'll re-source the stackrc file just to make sure we have all of our environment variables set up properly. When we launch the import it actually starts an OpenStack Mistral workflow... pretty cool stuff.
$ . ~/stackrc
$ openstack baremetal import instackenv.json
Started Mistral Workflow. Execution ID: 1d3fe9c5-35e7-4463-a2d4-b17ed7365630
Successfully registered node UUID 1dbb723e-d9a5-4431-b77a-6969588355ff
Started Mistral Workflow. Execution ID: 8716cf03-4fae-45d0-8408-6f2117ab0344
Failed to set nodes to available state: IronicAction.node.set_provision_state failed: <class 'ironicclient.common.apiclient.exceptions.BadRequest'>: The requested action "provide" can not be performed on node "1dbb723e-d9a5-4431-b77a-6969588355ff" while it is in state "enroll".
*sigh* And so it begins...
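Before retrying an import it's worth clearing out the half-registered node so it doesn't linger in the 'enroll' state; something like the following should do it (older clients use 'ironic node-delete' instead):
$ openstack baremetal node list
$ openstack baremetal node delete 1dbb723e-d9a5-4431-b77a-6969588355ff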
The 'pxe_ipmitool' driver uses the ipmitool utility... you can also run that tool manually from the command line, which is pretty interesting.
$ ipmitool -H 192.168.50.7 -U root -P OpenStack -N 5 channel info
Activate Session command failed
Error: Unable to establish LAN session
Now that is pretty frustrating because Dell documentation says that should just work.
The ipmitool supports a few different IPMI Interfaces:
- open – Linux OpenIPMI interface
- imb – Intel IMB
- lan – IPMI v1.5 LAN interface
- lanplus – IPMI v2.0 RMCP+ LAN interface
$ ipmitool -I lanplus -H 192.168.50.7 -U root -P OpenStack -N 5 channel info
Error: Unable to establish IPMI v2 / RMCP+ session
Nope. I've seen multiple examples of this working on the Internet so it is doubly confusing.
To rule out hardware I replaced the 2950 with a 1950 and configured it in the same fashion. Struck out in exactly the same way. Possibly firmware versions, but some of the articles that show this working are pretty old. Tried a settings reset on the IPMI config, still with no luck. Double-checked to make sure I hadn't set an RMCP+ encryption key. Nope.
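If you're chasing the same problem, the most useful sanity checks are probably in-band ones: boot the node into Linux, load the OpenIPMI drivers as in the earlier sketch, and ask the BMC directly what it thinks its settings are (again, channel 1 is typical for these BMCs but may differ):
$ sudo ipmitool mc info
$ sudo ipmitool lan print 1
$ sudo ipmitool channel info 1
$ sudo ipmitool user list 1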
Okay then. So Ironic actually provides two other drivers that are considered "testing" drivers:
- fake_pxe provides stubs instead of real power and management operations. When using this driver you have to power the machines on and off, and set the current boot device, yourself.
- fake provides stubs for every operation, so that Ironic does not touch hardware at all.
We'll go down this path in the near future, but the next step is to move to newer hardware to see what's going to work and what isn't... in other words, to understand the scope of what we'll be faking. Pushing a power button or three, no worries. Pushing 16... ick. If there's enough working hardware then fake_pxe will be delayed until we experiment with adding additional Compute nodes and/or support for Swift/Cinder/Ceph.
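For reference, a fake_pxe node entry in instackenv.json would look roughly like the sketch below. With no BMC to interrogate, you likely have to supply the MAC address (and possibly cpu/memory/disk/arch) by hand; the MAC and name here are made-up placeholders.
{
  "nodes": [
    {
      "pm_type": "fake_pxe",
      "mac": ["aa:bb:cc:dd:ee:ff"],
      "name": "TAG-XXX"
    }
  ]
}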
The Dell Gen 10 hardware proved slightly better but still not working. The Gen 11 hardware worked perfectly... with the pxe_ipmitool driver. It didn't work with the pxe_drac driver, which is the one recommended for Gen 11 and newer Dell systems. Similarly, the HP Gen 8 worked fine with pxe_ipmitool but not with pxe_ilo. In the end, out of the 16 machines I started with, only 6 made the final cut after ruling out IPMI issues and bad hardware (I pulled this stuff out of a stack... parts are going to be bad).
So my final line-up looks like this:
(1) Dell R210
(2) Dell R310
(1) Dell R410
(1) Dell R515
(1) HP ProLiant DL320e-Gen8
I had initially been running the undercloud director on the Dell R210; however, I'll be re-installing the director on one of the Gen 9 or Gen 10 machines, since the director doesn't require IPMI and that'll make one less machine I have to fake. That brings me up to 7 machines in the initial cluster. I'm thinking one Controller, 3 Ceph and 2 Compute... we'll see. But for now that means we delay playing with the fake_pxe driver.
$ cat instackenv.json
{
  "nodes": [
    {
      "pm_type": "pxe_ipmitool",
      "pm_addr": "192.168.50.2",
      "pm_user": "root",
      "pm_password": "OpenStack",
      "name": "TAG-203"
    },
    {
      "pm_type": "pxe_ipmitool",
      "pm_addr": "192.168.50.3",
      "pm_user": "root",
      "pm_password": "OpenStack",
      "name": "TAG-207"
    },
    {
      "pm_type": "pxe_ipmitool",
      "pm_addr": "192.168.50.4",
      "pm_user": "root",
      "pm_password": "OpenStack",
      "name": "TAG-183"
    },
    {
      "pm_type": "pxe_ipmitool",
      "pm_addr": "192.168.50.5",
      "pm_user": "root",
      "pm_password": "OpenStack",
      "name": "TAG-206"
    },
    {
      "pm_type": "pxe_ipmitool",
      "pm_addr": "192.168.50.6",
      "pm_user": "root",
      "pm_password": "OpenStack",
      "name": "TAG-202"
    }
  ]
}
$ openstack baremetal import instackenv.json
Started Mistral Workflow. Execution ID: 14f7d053-e749-4abc-999a-70f3de2f1de8
Successfully registered node UUID fef86621-9491-48af-b5c6-2104bc88a7fc
Successfully registered node UUID 5769ec4d-181e-4e1d-87dd-a6e3891ecf6d
Successfully registered node UUID 75554bfa-b300-48c3-b6d8-ce3f68c67859
Successfully registered node UUID cde855a9-0188-4912-a2c2-06dc55e582f7
Successfully registered node UUID 93df3756-9f61-47ce-b12b-c5f2b3ab846f
Started Mistral Workflow. Execution ID: 93b3f11c-8e8b-4081-91bf-d6723cd58b81
Successfully set all nodes to available.
$ openstack baremetal node list
+--------------------------------------+---------+---------------+-------------+-----------------+-------------+
| UUID                                 | Name    | Instance UUID | Power State | Provision State | Maintenance |
+--------------------------------------+---------+---------------+-------------+-----------------+-------------+
| fef86621-9491-48af-b5c6-2104bc88a7fc | TAG-203 | None          | power off   | available       | False       |
| 5769ec4d-181e-4e1d-87dd-a6e3891ecf6d | TAG-207 | None          | power off   | available       | False       |
| 75554bfa-b300-48c3-b6d8-ce3f68c67859 | TAG-183 | None          | power off   | available       | False       |
| cde855a9-0188-4912-a2c2-06dc55e582f7 | TAG-206 | None          | power off   | available       | False       |
| 93df3756-9f61-47ce-b12b-c5f2b3ab846f | TAG-202 | None          | power off   | available       | False       |
+--------------------------------------+---------+---------------+-------------+-----------------+-------------+
And that's that. I've started some of the Gen 9 and Gen 10 servers running the big CD-based driver updates for their various models. Might help. Might not. It's usually good to be current.
Next up is the introspection phase, where IPMI will tell the machines to power on and they will PXE boot a bit of software that evaluates each node's capabilities. This allows for automatic classification of servers into certain roles, if you choose to go that route. Otherwise it's just good information to see, and it proves that everything is working properly. We'll discuss flavors (part of the node classification in this context) briefly, and that'll be it for the prep phase; then we'll finally move on to deploying the Overcloud.
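If you want to peek ahead, on this version of TripleO the introspection step should be something along these lines (the exact command varies a bit between releases):
$ . ~/stackrc
$ openstack baremetal introspection bulk start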
Have a great New Year, everyone. And if you have thoughts on my ongoing IPMI issues with Dell Gen 9/10 hardware, please let me know.