
Chaining networks and extracting intermediate calculations in Neural.Graph #487

Open
mreppen opened this issue Jan 19, 2020 · 4 comments

@mreppen mreppen commented Jan 19, 2020

Short background: I have two inputs x₁ and x₂ and a label y = f₁(x₁) + f₂(x₂). The functions f₁ and f₂ are unknown and are approximated by networks f1nn and f2nn. So far so good: I can train a big network ynn containing the structure of f1nn and f2nn. However, the predicted value y itself is not of interest; what I actually want are the learned approximations of f₁ and f₂.

In the current form of the Graph module, I couldn't figure out a convenient way to do this. At first I extracted these networks by copying neurons, but then moved on to constructing the networks for f₁ and f₂ first and creating ynn from the neurons of f1nn and f2nn (without copying); by mutability, those neurons are then updated when ynn is trained. Prediction is then accomplished by Graph.model f1nn x.
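The sharing trick described above can be illustrated with plain closures; the names f1nn / f2nn / ynn below are stand-ins, not Owl's API. Both f1nn and ynn close over the same mutable parameter, so any update made while training ynn is visible through f1nn as well:

```ocaml
(* Plain-closure sketch of sharing by mutability (hypothetical names,
   no Owl dependency). *)
let w1 = ref 1.0
let w2 = ref 1.0
let f1nn x = !w1 *. x                 (* stand-in for Graph.model f1nn *)
let f2nn x = !w2 *. x
let ynn (x1, x2) = f1nn x1 +. f2nn x2

(* pretend a training step on ynn updated the shared parameter *)
let () = w1 := 3.0

let y1 = f1nn 2.0                     (* sees the update: 6.0 *)
let ys = ynn (2.0, 2.0)               (* 6.0 +. 2.0 = 8.0 *)
```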

In the applications of interest to me this is a fairly common problem, so some sort of support for it would be great, unless my use case is very niche.

I see three ways to accomplish this.

  1. Like in the crude code below, just take existing neurons from a given network and chain them onto another network.
  2. Create a Network or Subnetwork node which does essentially the same thing. This could be more pure, as one does not have to rely on mutability as in 1. Instead, when chaining, everything could be copied, provided the Network node has an interface for extracting a copy to use for prediction later. This approach also makes the network printout a bit cleaner, as there are fewer nodes.
  3. For any given node in a network, make a network copy with all nodes it depends on, i.e., all of its (transitive) prev nodes.

In my opinion, the second is the cleanest, whereas the third is the most flexible. If anything is of interest, I would gladly work on a PR.

open Graph

(* Append copies of [nn]'s nodes onto the network that contains
   [in_node], rewiring [nn]'s input to [in_node]; returns the copy of
   [nn]'s first output node. *)
let chain_network_ nn in_node =
  let in_nn = get_network in_node in
  (* register [child] in the host network and wire it to [parents] *)
  let add_node parents child =
    in_nn.size <- in_nn.size + 1;
    connect_to_parents parents child;
    in_nn.topo <- Array.append in_nn.topo [| child |]
  in
  let find_node name topo =
    let x = Owl_utils_array.filter (fun n -> n.name = name) topo in
    x.(0)
  in
  (* fresh, unconnected copies of all of [nn]'s nodes, with names
     prefixed by [nn.nnid] to avoid clashes in the host network *)
  let new_nodes =
    Array.map
      (fun node : node ->
        let neuron = node.neuron in
        make_node ~name:(nn.nnid ^ node.name) ~train:node.train [||] [||] neuron None in_nn)
      nn.topo
  in
  (* rewire the copies; [nn]'s input nodes are replaced by [in_node] *)
  Array.iter2
    (fun node node' ->
      match node.neuron with
      | Input _ -> ()
      | _ ->
        node'.prev <- Array.map
          (fun n -> match n.neuron with
            | Input _ -> in_node
            | _ -> find_node (nn.nnid ^ n.name) new_nodes)
          node.prev;
        node'.next <- Array.map (fun n -> find_node (nn.nnid ^ n.name) new_nodes) node.next;
        add_node node'.prev node')
    nn.topo
    new_nodes;
  let output_name = nn.nnid ^ (Array.get (get_outputs nn) 0).name in
  find_node output_name new_nodes
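Option 3 from the list above could be sketched roughly as follows, with a simplified record type in place of Owl's actual node type (all names here are hypothetical). A depth-first traversal of prev collects the target node together with everything it transitively depends on, and emits them already in topological order:

```ocaml
(* Simplified stand-in for Owl's Graph.node, for illustration only. *)
type node = { name : string; mutable prev : node list }

(* Collect [target] and all nodes it transitively depends on,
   dependencies before dependents. *)
let dependency_closure (target : node) : node list =
  let visited = Hashtbl.create 16 in
  let acc = ref [] in
  let rec visit n =
    if not (Hashtbl.mem visited n.name) then begin
      Hashtbl.add visited n.name ();
      List.iter visit n.prev;      (* parents first *)
      acc := n :: !acc
    end
  in
  visit target;
  List.rev !acc

(* tiny diamond graph: d depends on b and c, both depend on a *)
let a = { name = "a"; prev = [] }
let b = { name = "b"; prev = [a] }
let c = { name = "c"; prev = [a] }
let d = { name = "d"; prev = [b; c] }
let names = List.map (fun n -> n.name) (dependency_closure d)
(* -> ["a"; "b"; "c"; "d"] *)
```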
@jzstark jzstark commented Jan 22, 2020

If I understand correctly, the problem is to get the output of any layer in the network during the inference phase. I ran into similar issues before and had to manually accumulate the output of each layer:

let run' topo x =
  let last_node_output = ref (F 0.) in
  Array.iteri (fun i n ->
    let input  = if i = 0 then x else !last_node_output in
    let output = run [| input |] n.neuron in
    last_node_output := output
  ) topo;
  !last_node_output

In Keras this can be done simply via model.layers[index].output. It would be great to achieve the same in Owl, so please feel free to open a PR with your solution.
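The same accumulation idea can return every intermediate output rather than just the last one. A sketch with plain float -> float functions standing in for layers (no Owl dependency; run_all is a hypothetical name):

```ocaml
(* Run each "layer" in topo order and keep every intermediate output,
   which gives per-layer access similar to Keras' layers[i].output. *)
let run_all (topo : (float -> float) array) (x : float) : float array =
  let outputs = Array.make (Array.length topo) 0.0 in
  Array.iteri (fun i layer ->
    let input = if i = 0 then x else outputs.(i - 1) in
    outputs.(i) <- layer input
  ) topo;
  outputs

let layers = [| (fun x -> x +. 1.0); (fun x -> x *. 2.0) |]
let outs = run_all layers 3.0    (* [| 4.0; 8.0 |] *)
```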

@mreppen mreppen commented Jan 24, 2020

I have a proof of concept for point 3 that builds a new network from only the nodes that a given node depends on (all transitive dependencies, not just the immediate prev nodes). It takes an optional list of node names and replaces those nodes with inputs, which allows starting the prediction at any node.

It should be possible to reuse this code to do only prediction, without building a new network, if that is more desirable.
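The cut-at-named-nodes behaviour could look roughly like this, again with a simplified hypothetical node type rather than Owl's: traversal stops at any name listed in cuts and substitutes a fresh input-like node there, so prediction can start at that point.

```ocaml
(* Simplified stand-in for Owl's node type, for illustration only. *)
type node = { name : string; kind : string; mutable prev : node list }

(* Collect the dependency closure of [target], replacing any node whose
   name is in [cuts] with a fresh input-like node. *)
let extract ?(cuts = []) (target : node) : node list =
  let visited = Hashtbl.create 16 in
  let acc = ref [] in
  let rec visit n =
    if not (Hashtbl.mem visited n.name) then begin
      Hashtbl.add visited n.name ();
      if List.mem n.name cuts then
        (* stop here: the subtree below becomes a fresh input node *)
        acc := { n with kind = "Input"; prev = [] } :: !acc
      else begin
        List.iter visit n.prev;
        acc := n :: !acc
      end
    end
  in
  visit target;
  List.rev !acc

let a = { name = "a"; kind = "Input";  prev = [] }
let b = { name = "b"; kind = "Linear"; prev = [a] }
let c = { name = "c"; kind = "Linear"; prev = [b] }
let kinds = List.map (fun n -> (n.name, n.kind)) (extract ~cuts:["b"] c)
(* -> [("b", "Input"); ("c", "Linear")] *)
```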

Before I send a [WIP] PR or go into the details, let me ask:

  • Is it correct that network.topo has to be ordered?
  • Is it correct that network.roots may be arbitrarily ordered?
@jzstark jzstark commented Jan 24, 2020

Yes, the nodes in network.topo have to be topologically sorted. I don't quite see any ordering issue with roots, though: they are the inputs of the graph, so the order only depends on the user's choice. If you use the inputs layer and decide the inputs should be ndarray A and then B, or the other way around, both orders are fine.
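The invariant confirmed above is that every node in topo appears after all of its parents. A minimal sketch of producing such an order with a depth-first pass, over a hypothetical adjacency list of parent names (plain stdlib, not Owl's implementation):

```ocaml
(* Topologically sort node names given each node's parent list. *)
let topo_sort (parents : (string * string list) list) : string list =
  let visited = Hashtbl.create 16 in
  let order = ref [] in
  let rec visit name =
    if not (Hashtbl.mem visited name) then begin
      Hashtbl.add visited name ();
      (* visit all parents before emitting this node *)
      List.iter visit (try List.assoc name parents with Not_found -> []);
      order := name :: !order
    end
  in
  List.iter (fun (name, _) -> visit name) parents;
  List.rev !order

let sorted =
  topo_sort [ ("out", ["h1"; "h2"]); ("h1", ["in"]); ("h2", ["in"]) ]
(* -> ["in"; "h1"; "h2"; "out"] *)
```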

@ryanrhymes ryanrhymes commented Jan 24, 2020

ordering is not required for roots afaik
