Finish a lot of the docs
billdueber committed Sep 27, 2024
1 parent fc3efa3 commit d63a7e6
Showing 13 changed files with 339 additions and 385 deletions.
81 changes: 45 additions & 36 deletions README.md
@@ -2,54 +2,59 @@

A new(-ish, these days) discovery system for the Middle English Dictionary.

## Repositories
Confusingly, there are three separate repositories:
* [dromedary](https://github.com/mlibrary/dromedary), this repo, is the
**Rails application**. The name was given the project
when someone decided we should start naming projects with nonsense words.
* [middle_english_dictionary](https://github.com/mlibrary/middle_english_dictionary) is
not, as one might expect, the Middle English Dictionary code. Instead,
it's the code that pulls out indexable data from each little
XML file, inserts things like links to the OED and DOE, and serves
as the basis for building solr documents.
* [middle-english-argocd](https://github.com/mlibrary/middle-english-argocd) (_private_) is the argocd setup, which deals with environment
variables and secrets, and serves to push the application to production. It also
has a small-but-valid .zip file under the `sample_data` directory.

* **Public Repository for app**: https://github.com/mlibrary/dromedary
* **Private repository for argo build**: https://github.com/mlibrary/middle-english-argocd
## Documentation
* [Setting up a development environment](docs/setting_up.md) runs through
how to get the docker-compose-based dev environment off the ground and
index some data into its local solr.
* [Overview of indexing](docs/indexing.md) talks about what the indexing
process does, where the important files are, and what code might be
interesting.
* [Configuration](docs/configuration.md) does a _very_ brief run through
the important ENV values. In general, the [compose.yml](compose.yml) file,
the argocd repository, and especially [lib/dromedary/services.rb](lib/dromedary/services.rb)
will be the best place to see what values are available to change. _Don't do that
unless you know what you're doing, though_.
* [Solr setup](docs/solr_setup.md) looks at the interesting bits of the
solr configuration, in particular the suggesters (for autocomplete).
* [Tour of the application code](docs/application_code.md) is a quick look at how
the MED differs from a stock Rails application.
* [Deployment to production](docs/deployment.md) shows the steps for building the
correct image and getting it running on the production cluster, as well as
how to roll back if something went wrong.

If you need access to the private repo, get in touch with A&E.
### Access links
* **Public-facing application**: https://quod.lib.umich.edu/m/middle-english-dictionary/
* **"Preview" application with exposed Admin panel**: https://preview.med.lib.umich.edu/m/middle-english-dictionary/admin

## Setup your development environment
### About upgrading

The development environment is set up to use `docker compose` to manage
the rails application, solr, and zookeeper (used to manage solr).
This repo currently runs on Ruby 2.x and Blacklight 5.x, and there are no plans
to upgrade either.

To build and start running the application:

```shell
docker compose build
docker compose up -d
```

### Test access to the application and solr

* **Error page**: http://localhost:3000/. Don't let that confuse you.
* **Splash page**: http://localhost:3000/m/middle-english-dictionary.
* **Solr admin**:
* **url**: http://localhost:9172
* **username**: solr
* **password**: SolrRocks

**NOTE** At this point you can't do any searches, because there's no data in the
solr yet.
<pre>


### Indexing a file locally

NOTE: You can't index a file locally through the administration interface -- that's
hooked directly to an AWS bucket, and won't affect your local install at all
(it'll replace the `preview` solr data).
</pre>

* Make sure
<hr>
<hr>

```shell
docker compose run app -- bin/index_new_file.rb <path>/<to file>.zip
```

Give it however long it takes (a couple minutes for a minimal file,
and around an hour for a full file).

You'll know it's done when the


# OLD STUFF
@@ -107,10 +112,14 @@ docker-compose up -d


Verify the application is running at http://localhost:3000/

## Bring it all down then back up again
```shell
docker-compose down
```
```shell
docker-compose up -d
```

Note that there's no data in it yet, so lots of actions will throw errors. It's time
to index some data.
7 changes: 6 additions & 1 deletion app/controllers/catalog_controller.rb
@@ -187,7 +187,12 @@ class CatalogController < ApplicationController
# solr request handler? The one set in config[:default_solr_parameters][:qt],
# since we aren't specifying it otherwise.

# config.add_search_field 'Keywords', label: 'Everything'

######################### WHAT ARE THE DOLLAR-SIGN VARIABLES??? ############
# These are sent to solr as the actual string (e.g., solr gets "$everything_qf").
# They are then expanded by solr within the solr process based on configuration
# files there. See, e.g., solr/dromedary/conf/solrconfig_med/entry_searches/everything_search.xml
# for an example.

config.add_search_field("anywhere", label: "Entire entry") do |field|
field.qt = "/search"
9 changes: 9 additions & 0 deletions compose.yml
@@ -43,6 +43,15 @@ services:
- -b
- 0.0.0.0

db:
image: postgres:12-alpine
ports:
- "5432:5432"
environment:
- POSTGRES_PASSWORD=postgres
- PGDATA=/var/lib/postgresql/data/db
volumes:
- db:/var/lib/postgresql/data

solr:
build: solr/.
52 changes: 52 additions & 0 deletions docs/application_code.md
@@ -0,0 +1,52 @@
# Tour of the application code

In general, the MED is a "normal" Blacklight (v5) application, with a lot
of added stuff.

Like most Blacklight apps, the heart of the configuration is
in the [Catalog Controller](../app/controllers/catalog_controller.rb),
specifying the search and facet fields. This is repeated for the
other controllers, and they're fairly straightforward so long as
you're willing to take on faith that Rails will find things at the
right time.


## Models
The files in [models](../app/models) are unusual for a Rails app in that we're not using
a backing database, and thus not deriving them from ActiveRecord.
The directory is empty of anything interesting.

The _actual_ objects to represent each of the many, many layers of a dictionary
entry are actually defined in the [middle_english_dictionary](https://github.com/mlibrary/middle_english_dictionary)
repository. The nomenclature can be confusing, since it's derived from
the jargon of the field, but none of the objects are particularly complex.

## Presenters

The meat of the interface is actually built into the presenters.
[quotes presenter](../app/presenters/quotes/index_presenter.rb)
is indicative of the setup, pulling in lots of XSLT and getting values
out of the document with XSLT transforms or by querying the
`Nokogiri::XML` document directly.

## lib/med_installer

...is all indexing code, and isn't used in the rest of the application. See
[indexing.md](indexing.md) for a brief overview.

## /solr

...has all the solr configuration in it, as well as the
[Dockerfile](../solr/Dockerfile) and the [container initialization](../solr/container/solr_init.sh)
code. See [solr.md](solr.md) for more.

## Everything else

...is basically just normal views and utility objects
(most importantly the [services.rb](../lib/dromedary/services.rb) file).






84 changes: 25 additions & 59 deletions docs/autocomplete_setup.md
@@ -4,90 +4,58 @@ The blacklight autocomplete setup is pretty brittle and needs some mucking
about with to get it to work with multiple different input boxes that
target different solr autocomplete endpoints.

## Three things that need doing
## Three things that needed doing
* Add a new autocomplete handler in the solr config
* Make changes to `config/autocomplete.yml`
* Add the option to the dropdown where the users determines what to search

If you're adding a whole new autocomplete input field in the HTML,
you'll also need to:
* Create the input box with
* Make changes to `autocomplete.js.erb`

* Add javascript code to trigger autocomplete
to the dropdown where the users determine what to search

## Configure the new autocomplete handler in the solr config

Suggesters live in `solr/med/conf/solrconfig_med/suggesters`. You
can pattern a new one off of the ones in there.
Suggesters live in
[solr/dromedary/conf/solrconfig_med/suggesters](../solr/dromedary/conf/solrconfig_med/suggesters).
You can pattern a new one off of the ones in there.

Of course, if you're using an existing handler in a new context (say,
you want autocomplete for headwords again in an advanced search box), you
can just use the handler that's already been defined and skip ahead to
making and configuring the new search input field and dropdown.

Things to note:
* The name in `<str name=-name...>` in the top section is any name you
make up to identify this suggester.
* ... but the `suggest.dictionary` in the bottom section *must* match that name
* ... and same with the `<arr name="components"...>` in the bottom

* You can pick any names for the suggester handler, but it
must be used _identically_ in three places:
* The top `<str name=-name...>`
* The name of the `suggest.dictionary`
* The reference in the `<arr name="components"...>` block.
* The `field` is the name of the field you're basing this on. It *must*
be a stored field!
be a stored field! The code we have now builds a special field for this
instead of trying to force an existing field to work.
* `"suggestAnalyzerFieldType"` is probably the same fieldType used for the
field you're indexing in this typeahead field, but if you have the know-how
and think it should be different, go for it.


## Add new suggester to autocomplete in `config/autocomplete.yml`

```yaml
# Autocomplete setup.
# The format is:
# search_name:
# solr_endpoint: path_to_solr_handler
# search_component_name: mySuggester
#
# The search_name is the name given the search in the
# `config.add_search_field(name, ...)` in catalog_controller
#
# The "keyword" config mirrors the default blacklight setup

default: &default
keyword:
solr_endpoint: path_to_solr_handler,
search_component_name: "mySuggester"
h:
solr_endpoint: headword_only_suggester
search_component_name: headword_only_suggester
hnf:
solr_endpoint: headword_and_forms_suggester
search_component_name: headword_and_forms_suggester
oed:
solr_endpoint: oed_suggester
search_component_name: oed_suggester

development:
<<: *default

test:
<<: *default

production:
<<: *default

```
## Add new suggester to the autocomplete configuration
Pattern match from another entry
and add it to [autocomplete.yml](../config/autocomplete.yml)
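
As a rough illustration of what an entry amounts to once loaded, the sketch below parses a small inline YAML fragment with Ruby's standard `yaml` library and looks up one suggester. The `h` and `hnf` entries follow the pattern used for the existing suggesters; the inline sample stands in for the real file, and the `endpoint_for` helper is hypothetical and not part of the application.

```ruby
require "yaml"

# Inline stand-in for config/autocomplete.yml (a trimmed sample, not the real file).
AUTOCOMPLETE_SAMPLE = <<~CONFIG
  h:
    solr_endpoint: headword_only_suggester
    search_component_name: headword_only_suggester
  hnf:
    solr_endpoint: headword_and_forms_suggester
    search_component_name: headword_and_forms_suggester
CONFIG

# Hypothetical helper: return the solr endpoint configured for a search name,
# or nil when no autocomplete entry exists for it.
def endpoint_for(config, search_name)
  entry = config[search_name.to_s]
  entry && entry["solr_endpoint"]
end

config = YAML.safe_load(AUTOCOMPLETE_SAMPLE)
puts endpoint_for(config, :hnf)
```

The key on the left (`h`, `hnf`) is what ties an entry to the search field name, so a new entry only needs a new key plus the two values naming its solr handler.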

## Catalog controller configuration

Now load the autocomplete setup into your blacklight configuration.

```ruby

# Autocomplete on multiple fields. See config/autocomplete.yml
config.autocomplete = ActiveSupport::HashWithIndifferentAccess.new Rails.application.config_for(:autocomplete)

```

## Adding a whole new dropdown
## Adding a whole different search box

I don't expect this will happen at this point, but the knowledge may
well come in handy on other projects.

### Adding a whole new dropdown

* Put a data attribute `data-autocomplete-config` on your text box
to reference which typeahead configuration should be used (e.g.,
`h` or `hnf` in the config example above).
@@ -116,7 +84,5 @@ the correct index when a user picks, e.g., "headword only."
## Side note: this overrides blacklight code

I don't actually monkey-patch, but I do use `Module#prepend`. The code
is in `config/initializers/autcomplete_override_code.rb`
is in [autocomplete_override_code.rb](../config/initializers/autocomplete_override_code.rb).

If Blacklight ever changes the autocomplete setup to allow this sort of
thing, we'll need to re-evaluate whether these extensions are necessary.
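
For readers unfamiliar with `Module#prepend`, here is a minimal sketch of the technique. It is not the code in the override file: `StockSuggester`, `MultiSuggester`, and the default `"suggest"` endpoint are invented for illustration, and only the `h`/`hnf` suggester names come from this project.

```ruby
# Hypothetical stand-in for a library class that knows about only one endpoint.
class StockSuggester
  def endpoint(_search_name)
    "suggest"
  end
end

# Hypothetical override module. Because it is prepended, its `endpoint`
# runs first and can fall back to the original method via `super`.
module MultiSuggester
  ENDPOINTS = {
    "h" => "headword_only_suggester",
    "hnf" => "headword_and_forms_suggester"
  }.freeze

  def endpoint(search_name)
    ENDPOINTS[search_name] || super
  end
end

StockSuggester.prepend(MultiSuggester)

suggester = StockSuggester.new
puts suggester.endpoint("h")       # handled by the prepended module
puts suggester.endpoint("keyword") # falls through to the original method
```

Unlike reopening the class and redefining the method, the original implementation stays intact and reachable through `super`, which keeps the override easy to remove later.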
64 changes: 64 additions & 0 deletions docs/configuration.md
@@ -0,0 +1,64 @@
# Configuration

Configuration of this application has...grown organically. Thus, lots
of things are spread out over a few locations.

## ./config

**The normal rails `config` directory** has all the "normal" stuff
in it, including the [`routes.rb`](../config/routes.rb) file which can be inspected to
understand what things are exposed.

Additions to the normal rails/blacklight stuff include:
* [autocomplete.yml](../config/autocomplete.yml) configures the exposed
names and the solr targets for each autocomplete field's `suggester` handler.
* [autocomplete_override_code.rb](../config/initializers/autocomplete_override_code.rb),
which provides extra code (via `Module#prepend`) to deal with the fact
that we're running multiple suggesters.
* [load_local_config.rb](../config/load_local_config.rb) had so much
stripped away that it's now mostly utility code to reload the
`hyp_to_bibid` data from the current solr and extract data from
the name of the underlying collection.

## Controllers

The [CatalogController](../app/controllers/catalog_controller.rb) is the heart
of any Blacklight application, specifying how to talk to solr, what fields to
expose as searchable, etc. We have controllers for all the different aspects
of the site (e.g., bibliography, quotes), so make sure you're looking at
the right one.

## Mystery Solr Configuration in the controllers

The solr configuration in the controllers includes variables that look like
`$everything_qf` where normally you'd have a list of fields. The decision was
made to actually store this configuration in _solr_, so sending that
magic reference will cause solr (on its end) to use the values defined in
the XML up there. See ([headword_and_forms_search.xml](../solr/dromedary/conf/solrconfig_med/entry_searches/headword_and_forms_search.xml))
for a representative sample.
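
To make the mechanism concrete, the sketch below shows the kind of string solr ends up receiving. It assumes the reference travels as a solr local parameter, which is how stock Blacklight configurations usually pass such values; the two helper methods are hypothetical and are not how Blacklight builds its requests internally.

```ruby
# Hypothetical helper: render a hash as a solr local-parameters prefix,
# e.g. {qf: "$everything_qf"} becomes "{!qf=$everything_qf}".
def local_params_prefix(params)
  pairs = params.map { |key, value| "#{key}=#{value}" }
  "{!#{pairs.join(" ")}}"
end

# Hypothetical helper: prepend the local parameters to the user's query text.
def build_query(user_query, local_params)
  "#{local_params_prefix(local_params)}#{user_query}"
end

# The "$everything_qf" string is passed through untouched; the application
# never sees the list of fields it stands for.
puts build_query("knight", {qf: "$everything_qf"})
```

The point is that the Ruby side treats `$everything_qf` as an opaque string. The field list it names lives only in the solr configuration, so changing which fields are searched means editing the XML, not the controller.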


## The Dromedary::Services object

The [services object](../lib/dromedary/services.rb) is really the heart of
all the configuration. Every effort has been made to push everything
through it, as opposed to directly using ENV variables and such.

Instead of just being a passthrough for the environment, though, the
Services object also includes useful things derived from those
variables, including things like a connection object for the
configured solr.
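
The general shape of such an object can be sketched as follows. This is an illustration of the pattern only, not the contents of `services.rb`: the class name, the `SOLR_ROOT` and `SOLR_COLLECTION` variable names, and their defaults are all assumptions made for the example.

```ruby
require "uri"

# Hypothetical services object: reads raw settings from an environment-like
# hash and exposes values derived from them.
class SketchServices
  def initialize(env = ENV)
    @env = env
  end

  # Raw passthrough values, with assumed defaults for local use.
  def solr_root
    @env.fetch("SOLR_ROOT", "http://localhost:8983/solr")
  end

  def solr_collection
    @env.fetch("SOLR_COLLECTION", "med")
  end

  # Derived value: the full URI of the collection, built once and memoized.
  def collection_uri
    @collection_uri ||= URI.parse("#{solr_root.chomp("/")}/#{solr_collection}")
  end
end

services = SketchServices.new(
  "SOLR_ROOT" => "http://solr:8983/solr/",
  "SOLR_COLLECTION" => "demo"
)
puts services.collection_uri
```

Routing every lookup through one object means the rest of the code asks for `collection_uri` rather than reassembling it from environment variables in several places.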

Essentially everything you need to understand how the application is
influenced "from the outside" (e.g., the argocd config) is in this file.

## AnnoyingUtilities

The [annoying_utilities.rb](../lib/annoying_utilities.rb) file once did
all the little, annoying substitutions that were necessary to run the application
on two relatively-different staging and production platforms. Now that
it's all container-based, all of that code has been ripped out and
replaced with thin wrappers around `Dromedary::Services`. Mentioned
here because it's still in the code in some places and might be
confusing.
