-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathTODO
129 lines (112 loc) · 15.2 KB
/
TODO
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
Modeling discussion:
- Customer allows us to associated deployed resources with a customer, which ultimately helps us to bill them, know when to cut off services, etc.
- App allows us to provide different levels of service on a per-application basis even for the same Customer -- some Apps will be under different contracts, some will be more important to the customer, etc.
- Instance allows us to have different Services for the same App, which may be deployed to different Hosts, deployed in multiples (e.g., "4 production app servers"), or only deployed sometimes (e.g., if we deploy a continuous integration server or staging server or load test server only sometimes for a given app).
- Service allows us to express what is necessary for a given Instance to work. E.g., for the production Instance of this customer's web app we need a web server, ssl certs, the rails stack, a postgres database, and a search engine. By expressing the services by name, the actual details of the deployment and management of Services can be handled by appropriately named classes, manifests, scripts, etc., via some deployment tool.
- Services have a DAG (directed acyclic graph) dependency relationship so that: we can specify just a few instance dependencies and get the remaining Service dependencies for free via the graph (i.e., ease of maintenance); also, the DAG implies a topological sort ordering which can be used by shell scripts or non-order-aware tools (I think chef falls into this category, probably cfengine; definitely capistrano and any ad hoc script-your-deployment tools) to get things deployed in the proper order.
- Host represents a compute resource which can be managed effectively by a tool like puppet, chef, cfengine, etc. Typically these are the resources we pay for and need to manage most effectively.
- A Deployment is a statement that an Instance is live and should have s
Services deployed to some set of Hosts. Using the Deployment information, we can generate deployment descriptions (e.g., puppet manifests, an ordering of chef recipes, etc.) for every Host which should have Services deployed.
- A Deployable represents a snapshot of the information used to deploy an Instance at a certain point in time. Given that the relationships between Services may change while an Instance is deployed, we either need to snapshot the information so that the Services deployed for the Instance do not change, or we need to do something more sophisticated with managing the Service DAG over time. I'm not smart enough to know how to do the latter at the moment.
- An Audit edge in the Service DAG implies that a certain Service always needs the auditing Service deployed with it. This allows us to support Services which need to know about other Services, common examples include: firewall rules, log rotation, backups, database backups, monitoring.
- A Deployed Service allows us to deploy a Service for an Instance to a Host. Since "audits" imply that one Service in the DAG may need to be on a separate Host than anotherSservice (though maybe regular Services imply this as well) we need a many:many mapping between Deployments and Hosts, which is what the Deployed Service is.
- An auditing relationship between Deployed Services show the operational Instance of a Deployed Service which knows about another Deployed Service -- mirroring the knowledge-level audit edge in the Service graph which represents one Service knowing about another. When doing deployments ideally we would started auditing Services after their audited Services, and when un-deploying we'd work in the opposite direction (if that is supported by the deployment tool which gets information from larry).
- Pools of hosts allow us to deploy instances to specific batches of hosts. This may be because we have a Pool of hosts for development, testing, production and want to deploy different Deployables to each. It could be because organizational divisions have different pools of physical hardware or billable VMs. It could be because we pool our VPS-provided slices based on some cost function, etc. As we get into capacity-based deployment, we can determine if the pool we're deploying to has sufficient capacity, and otherwise we can deploy more resources into the Pool.
- a Preference is a Specification-pattern implementation which assists in determining whether a Deployment can be successfully pushed out to a set of Hosts, or whether some sort of additional provisioning needs to take place. Specification composition allows building up more complex rule sets. Linking Preferences to Instances allows per-Instance deployment rules; similarly for Apps and Customers (Customers getting least priority, Apps next, Instances next). Conceivably there could be some sort of per-Service or per-Service-Edge (e.g., two Services must reside on the same Host, or must not), or some more sophisticated relationships between entities (this App's Instances cannot reside on the same Hosts as any of the Instances for this Customer), etc. Currently this is only being developed to do simple Role/Grade-based specifications.
- A Role is essentially a tag for Hosts, specifying what sort of work they're intended to handle. Examples might include "web server", "loghost", "firewall", etc. This might be used to put all continuous integration instances on one type of Host, or (with Service Preferences) to say that all firewall rules go on a firewall Host, etc.
- A Grade is a grouping for Hosts based upon some performance or capacity. This can be purely ad hoc and local, and is designed for administrators to set up to do capacity planning. So, a rails app requires a medium-sized server, a postgres production database requires a large server, and a build server requires a huge server ("medium-sized", "large", and "huge" being the Grades). This allows us to map our arbitrary capacities to hosting providers' arbitrary capacities so that we can then automatically say things like "well, to deploy this production db we need to bring up another large host, and our options are a linode 1080, a slicehost 1024, an amazon medium node, or craphost's mega or kinda servers, which do you want?"
- As discussed with Grade, hosting providers modelled by rubycloud (or presumably any other suitable provisioning library) come in provider-designated capacities or performances. We call these Flavors here and will map these to our local Grades so that we can determine what our provisioning options are.
- A Provider is just some hosting organization that offers the ability to turn on hosts of a certain specification on demand. We presume we'll be using rubycloud or deltacloud or libcloud or some such cloudy provisioning tool to do the legwork there.
---
insights from release (and ongoing review) of puppet-dashboard:
- could definitely make use of their styling, which would force us back to GPL
- the activity feed is cool, but we don't need it, yet; would be nice to track deployments / undeployments, maybe service changes, host additions, etc.
- there are a couple of tweaks and fixes that might be useful, mostly it's just the merging of node/report stuff, visual tweaks, ajaxyness and autocompletes
- should we pick up authentication bits and save some effort?
- would be interesting to make the reporting a Rack slice (rails metal, pancake app, etc.), or perhaps numerous pieces (nodes are core, reporting is a slice on top of nodes, customer/app/services/deployment is another; provisioning is yet another; perhaps CRM and other customer-related data merging is yet another)
---
["D" -> deployment milestone]
["P" -> provisioning milestone]
- Q: should there be merging of services, or do we expect something like puppet to take care of that? what if puppet is not using it? E.g., we want apache installed, by 10 apps, but only one apache installation should happen. We can deduce this by the fact that there aren't any variations in the parameters applied to apache for the various instances. Should we do something with this?
D puppet manifest should probably include a class for each customer, a class for each app, a class for each instance, in case we need to hook anything off of them (these would probably have the settings at the appropriate level (i.e., customer class would get only customer-level parameter settings, etc.))
D model: should snapshot services + parameters into the deployable and read from there
D LDAP auth?
D view:
- list of *current* deployments
- link to undeploy manually, DeploymentsController#undeploy calls Deployment#undeploy()
- normally, hide deactivated deployment
- show deactived deployments in deploy dropdown, marking them as deactivated (never deployed)
D view: edit deployment
- can only shorten end_time for current deployments
- for future deployments, can set anything (obviously validators will fire)
D implement a dirty? predicate for deployable:
- true unless there's a snapshot (degenerate)
- true if taking a new snapshot would differ from the current snapshot
- this would show up when showing current deployments
- this would show up when showing deployments to choose from
---
- really need some sort of visibility of the time range that instances, hosts deployments are going to be in effect (definitely raphael graphs would rock)
- get better page titles showing up
- should probably do sorting by case-insensitive names
- links in descriptions could be auto-linked on display
- probably need way more h() calls in the views
- talk w/ stahnma: look at using load_data (see doc/vendor), or an updated equivalent, to load in per-class parameter settings in puppet
? maybe add an activity feed so that when we adjust other deployments by adding/changing a new one, that there's a record of it
? could also get ajaxy and say "ok, if you do this, here's what's going to get modified by this" on the deployment form
D provide warning if the same host is chosen as a current Deployment that we'll be ending that deployment
D Deployments can be edited so long as they are in the future
D can delete a future Deployment, regardless of DeployedServices; should clean up deployed_services
D Instance show view should show Deployables
D Instance show view should show current deployments
D Instance show view should show future deployments
D Instance show view should show past deployments
D should be able to see services deployed to a Host
D also, should be able to generate a history of a Host wrt services
? raphael-based service graphs, host-service graphs, customer service graphs, app service graphs, instance services graphs, etc.
? raphael-based interval maps?
- should be able to deploy an "audit" somewhere -- even if it requires specifying each audit for each host by hand; so this includes monitoring, log rotation, firewall rules, backups, database backups, etc.
---
P create pool (name, description)
P validate pool name presence, uniqueness
P hosts many:many with pool (:through model: PoolMembership)
P pool menu links
P pool index view
P pool new view, pool edit view
P pool show view (includes hosts)
P hosts show view includes pool memberships
P pool show view should allow adding host, removing hosts
P host show view should allow adding to pool, removing from pool
P deployments/new has a Pool chooser (requires a Pool)
P we can provision hosts that don't exist
P we can provision hosts into a pool
P we can ask a pool to provision hosts
P hosts can have capacities, so that we can estimate whether we have enough capacity for deploying instances
P instances can have costs, so that we can determine if we have sufficient host resources to deploy instances
P should be able to determine if a pool has sufficient capacity to deploy instance(s)
P should be able to recommend hosts to provision
P [ provision would be via rubycloud, etc. ]
P when deploying, should recommend hosts for instances, but provide a way to choose another host, or to provision a new host.
P provisioning is based on capacity/cost, fwiw
P could presumably associate a floor capacity w/ a Role
P Requirement model specifies which capacities, roles, etc., will satisfy a deployment attempt
thoughts:
P should be able to specify that certain instances not reside on the same host
P should be able to specify that certain instances should reside on the same host
P should be able to specify that a service (or at least, an audit) should be on the same host as another (think log rotation, e.g.)
P may want to be able to specify that a services (or at least, an audit) should not be on the same host as another ... or maybe it's that audits have their own Requirements for deployment?
- we can mark various things (instance, app, customer, host, ?) as "dirty" on changes, and save them up for a deployment, so that we can know if our planned changes have been put into action, know when we need to make a deployment, can batch up changes, and keep the process of creating deployments separate from the data which is used to do current production deployments
- could have indexes highlight strange situations: customers with undeployed apps, hosts with no deployments, apps which are not deployed, etc.
- we can use an instance as a template; if we make a new instance we basically make the instance, set template_id (or whatever) to the template instance, then delegate things to the template; we make the template instance's parameter values be defaults (behind customer, app, instance).
- presumably we could also template an app; we could either go the same route (have a template pointer to the template app), or we could just instantiate new instances using the old app's instances as template.
- general facility to associate remote resources with a concept (maybe w/ a parser, and we can generate an RSS feed)
-> add linkages to nagios details for a host, service, etc.
-> add linkages from customers to invoicing, e.g.
-> is it possible to link from apps (or instances) to github pages?
-> is it possible to link apps / customer to trouble ticket pages?
-> what about linking to PT, e.g.
- favicon.ico, heh
- since deployment records are captured we can run a tool like puppet consistently over them without fear of introducing changes
- since deployment manifests can only change when a deployment is created or on the time-interval boundaries of the deployments, we don't have to continuously poll for changes (i.e., we could schedule and/or notify instead).
- if we associate some sort of service category with an instance (e.g., "Database hosting", "Production Web Server, Smegme", etc.) we can apply rates to service categories, use the deployment history for the Instance(s) as a log of billable service, and invoice straight from this motherfucker. Not only that, but it could be billable at the service level.
--- Questions
Q: is the difference between a service that's required by many instances on a host (e.g., postgres) but which only really causes one installation to happen, vs. services that have the same name but differ in parameters and cause multiple things to happen (e.g., setting up a postgres *database* for an app) ... is this something larry needs to concern itself with, or do we just leverage the underlying management tool? If we're using puppet, puppet can sort things out, but it's not clear that something like chef, e.g., can.