I've got to send in my trained model this coming Thursday, but training my model takes so long that I know it won't be ready before I have to show it. Currently it executes an action every 0.5 seconds. Is there any way to speed this up massively?
You're gonna have to be way more specific. How on earth am I supposed to understand what you're doing? Like, "action"? What do you mean? What are you actually doing? What language, what sort of model, ...? The way you're asking, I bet you won't get any useful answers.
You'd need to provide a good bit more information, such as what programming language you've written it in, the libraries used, etc., to get an answer that's actually useful to you. However, something I've found immensely helpful in shortening script execution times and resource usage is understanding Big-O notation. The link is to a fairly informative 3-part tutorial series on it. The author wrote the examples in Python, but they apply to any language as long as you learn the concepts and swap out the syntax.
Right, my bad. For school I have to train a model with DQN to play Tetris in Python. I use gym_tetris as the env and a classic Keras DQN agent:

self.model = Sequential()
self.model.add(Dense(32, input_shape=(observation_space,), activation="relu"))
self.model.add(Dense(32, activation="relu"))
self.model.add(Dense(self.action_space, activation="linear"))
self.model.compile(loss="mse", optimizer=Adam(lr=0.001))

The problem is that every action (move left, right, ...) executed by my model currently takes about 0.5 s (actually 1.5 s now, and it's rising). Do you have any way to fix this problem?

PS: it's over 5 s now. I think there's a memory leak or something like that, but I don't know Python well enough to find the problem. Aaand it crashed... because of an access violation =( Sigh, I wish my teacher had actually explained how to do DQN.
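Not from the thread, but a hedged guess at a usual culprit for exactly this "0.5 s, then 1.5 s, then 5 s" pattern in DQN agents: an unbounded replay memory (a plain list that grows forever), often combined with training on the whole memory each step. A minimal sketch of a capped buffer, assuming that's the issue (the `ReplayBuffer` name and numbers are illustrative, not from the OP's code):

```python
from collections import deque
import random

class ReplayBuffer:
    def __init__(self, capacity=20000):
        # deque with maxlen drops the oldest experiences automatically,
        # so memory use and per-step cost stay constant over time.
        self.memory = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Train on a small random batch each step, never on the whole memory.
        return random.sample(self.memory, min(batch_size, len(self.memory)))

buffer = ReplayBuffer(capacity=1000)
for step in range(5000):
    buffer.remember(step, 0, 1.0, step + 1, False)

print(len(buffer.memory))  # 1000 — capped, despite 5000 inserts
```

If the agent currently does something like `self.memory = []` and then fits on all of it every frame, swapping in a capped buffer plus fixed-size batches is usually the single biggest speedup.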
If you are using a library, maybe, maybe not. If you wrote the whole training code yourself, then probably yes. Is the computer you are using fast? The easiest fix is a faster computer. Is your disk fragmented? Can you use a smaller training set? That looks like Python; are you using NumPy or some other library?
If you have a memory leak, first check that no variables are left open-ended; that's a common issue I've seen.
My PC is fast enough to run Civ 5 and Hearthstone at the same time without issues. No memory leak, it seems, and I don't use training sets; it's DQN, which means it learns from nothing. And yes, it's Python.
Optimization in a nutshell:

1. Make everything you repeat a lot fast (nb. amortized analysis).
2. Avoid unnecessary computation, e.g. by moving the burden to memory (nb. O(n log n) sorts).
3. Don't reserve memory without good reason (nb. bus and RAM speeds).
4. Don't optimize without real need.

One thing of note: immutable classes, for example, are good for maintenance and concurrency reasons, but they result in more memory allocation... so you still have to have some idea of what you're doing. Your problem smells like a memory leak, allocating too many new objects, or wild uncontrolled recursion that kills your stack, because the efficiency gets worse over time.
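Point 2 above ("moving the burden to memory") can be sketched in a few lines: cache the results of a repeated pure computation instead of redoing it. The recursive Fibonacci below is just a stand-in for any expensive function you call many times with the same arguments:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Without the cache this is exponential time; with it, each value
    # is computed once and every repeat call is a dict lookup.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(60))  # 1548008755920, returned instantly thanks to the cache
```

The trade-off is exactly the one described above: you spend memory (the cache) to avoid repeated computation, so it only pays off for functions that are pure and actually called repeatedly.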
You're still not making it easy for us. Like, how on earth am I supposed to know what "classic Keras" means when I look it up and the Wikipedia page literally says "numerous implementations of commonly used neural-network building blocks"?

The simplest fix is just reducing the size of the network you're using, or lowering the batch size. From what I've looked up just now, yes, Q-learning seems to fall in the online learning category. But of course the original poster doesn't even consider explicitly mentioning what type of problem it is, because "dqn" and "keras" are supposed to be enough. He's shooting himself in the foot.

That access violation is essentially a segmentation fault, which is weird. Memory management usually isn't an issue in Python, so it's probably Keras. Looking it up, Keras sometimes doesn't play nice with e.g. the GPU, if that's what you use for parallelization. A full reinstall, following every step strictly, can be a good idea in this case. Or it could be the batch size; that might cause issues too.

I would try two things: look up known Keras/Python memory issues, and lower the parameters so it's easier to run, to see whether your code works properly at all on a smaller scale.
Well, that's because I don't know the problem myself. I've got zero clue what I'm doing; I don't even understand what exactly happens when the model itself is created. I guess thanks for the tips.
That's not too bad time-wise. You have a few options:

1. Get more CPUs, since I'm going to assume you're using Keras with a TensorFlow backend and no GPU.
2. Ideally, run the tensorflow-gpu backend instead, if you have a GPU or access to one.
3. Probably too much work, but it would give you basically constant-time lookup: run random simulations and train offline, collect the data, then parse it into some lookup table/db/whatever, since I'm going to assume the game space isn't astronomical (Tetris, right?). Then when the game is played, you do a lookup rather than run your model.
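Option 3 above can be sketched with a plain dict: precompute a policy offline, then look actions up at play time instead of running the model. Everything here is hypothetical for illustration (the state encoding, the action names, and the two table entries are made up, not derived from gym_tetris):

```python
def make_state_key(board_heights, current_piece):
    # A hashable fingerprint of the game state; a real encoding
    # would need to capture much more of the board.
    return (tuple(board_heights), current_piece)

# Filled offline, e.g. from recorded simulations: state key -> best action.
policy_table = {
    make_state_key([0, 0, 1, 2], "L"): "move_left",
    make_state_key([3, 1, 0, 0], "I"): "rotate",
}

def choose_action(board_heights, current_piece, default="noop"):
    # O(1) dict lookup; fall back to a default (or the live model)
    # for states that were never seen offline.
    return policy_table.get(make_state_key(board_heights, current_piece), default)

print(choose_action([0, 0, 1, 2], "L"))  # move_left
print(choose_action([9, 9, 9, 9], "T"))  # noop (unseen state)
```

The catch is the caveat already raised above: this only works if the reachable state space is small enough to enumerate or sample densely, which for full Tetris boards is a big "if".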
Well, I downloaded another project which is quite similar to mine, and in the time mine plays half a game it does 2000 runs, so I know the issue is on my side. I just have to figure out what the problem is.
I suggest you look further into how you're structuring the training. There are so many examples showing how others train, even using live event handling. Personally, though, I'd start from some compilations that others have used and build up from there; at least then you'd find out how to properly test the weights. Just a simple Google search for "machine learning tetris sets python gym_tetris" came up with two interesting links I think you should check first: one and two. Of course, this one is also rather humorous.
OK, are you training it to learn some game? You probably have a really inefficient data structure you pick actions from. Or are you running some sort of search every time? If so, you have to improve that: either make the search queue up actions, or recalculate them all at once and just use them.

See if your state representation is good. Is it large? Do you hit swap/disk every time you use it?

Also, an access violation should not happen in Python; it means something is very wrong or missing. It really does sound like a memory leak or some very bad form of caching. Did you get the latest versions of all the components you need?

Also, perhaps you should consider running it on some cloud service, like AWS. I think all the major companies provide some form of free service with TensorFlow enabled, and those environments will definitely be faster than yours and set up correctly.

It is also possible that whatever library you are using is faulty/incomplete; if you have a choice, maybe change it. And these questions should really be asked of your TA and professor, especially if you can't even formulate the problem properly.
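The "make the search queue up actions" idea above can be sketched with a small action queue: plan several moves in one expensive call, then cheaply pop one per frame instead of re-searching every step. `plan_moves` here is a made-up stand-in for whatever search or model call the agent actually uses:

```python
from collections import deque

def plan_moves(state):
    # Stand-in for the expensive part (search / model inference);
    # pretend it returns a whole action sequence for the current piece.
    return ["move_left", "move_left", "rotate", "drop"]

action_queue = deque()

def next_action(state):
    if not action_queue:                  # only replan when the queue runs dry
        action_queue.extend(plan_moves(state))
    return action_queue.popleft()         # O(1) per frame otherwise

print([next_action(None) for _ in range(4)])
# ['move_left', 'move_left', 'rotate', 'drop']
```

With this shape, the expensive call runs once per planned sequence rather than once per frame, which is often the difference between 0.5 s per action and real-time play.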