

Welcome to PSP-Programming.com, a place for developers to get together.

Welcome to the forums. Here you can find other user tutorials as well as homebrew releases and the source code repository. You can also ask for help with your code here and post your own homebrew!

PSP-Programming.com Forums
February 08, 2012, 06:51:19 PM *
Author Topic: [C/C++] Yet Another VRAM manager - adaptable and high performance  (Read 5308 times)
Raphael
Global Moderator
Hero Member
« on: February 07, 2007, 02:15:57 PM »

I just wanted to try out a different memory management scheme for the VRAM (or, more generally, for small amounts of RAM) than the one I used in my valloc library. There I used a doubly linked list, as the libc malloc routine does. That solution, however, has noticeable overhead when managing only a small amount of memory (like the 2MB of VRAM). After playing around with an idea, I coded this version of the memory manager, and it turned out to be up to three times as fast as the previous version.
It uses a fixed block subdivision of the memory area to manage, and allocates a range by marking its starting block as allocated and setting that block's size field accordingly. The old version had to allocate a management structure for every valloc call, and the linked-list approach suffered from poor memory locality when traversing the list to find a free/allocated block.
The current implementation uses a fixed MEM_SIZE/BLOCK_SIZE*4 bytes of system RAM for management: with 512-byte blocks over 2MB of VRAM, that is 4096 entries of 4 bytes each, or 16KB.
As a result, the code is also much more readable and easier to understand. I hope some of you can learn something from it.
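
To make the fixed-block idea concrete, here is a minimal host-side sketch. It is not the code from vram.c: all names (block_table, sketch_alloc, sketch_free, ...) are made up for illustration, it returns byte offsets instead of VRAM pointers, and it merges free neighbours lazily during the allocation scan rather than using prev/next bits. It only assumes what the post states: 2MB of memory, 512-byte blocks, and one 32-bit entry per block.

```c
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE   (2u * 1024u * 1024u)     /* 2MB, as for the PSP VRAM */
#define BLOCK_SIZE 512u
#define NUM_BLOCKS (MEM_SIZE / BLOCK_SIZE)  /* 4096 blocks */
#define USED_FLAG  0x80000000u              /* hypothetical layout: top bit = used */
#define LEN_MASK   0x7FFFFFFFu              /* remaining bits = run length in blocks */
#define ALLOC_FAIL UINT32_MAX

/* One 32-bit entry per block: 4096 * 4 bytes = 16KB of bookkeeping.
 * Only the first entry of each run (free or used) is meaningful. */
static uint32_t block_table[NUM_BLOCKS];

void sketch_init(void) {
    for (uint32_t i = 1; i < NUM_BLOCKS; i++) block_table[i] = 0;
    block_table[0] = NUM_BLOCKS;            /* one free run covering everything */
}

/* First-fit allocation; returns a byte offset into the area or ALLOC_FAIL. */
uint32_t sketch_alloc(size_t size) {
    if (size == 0 || size > MEM_SIZE) return ALLOC_FAIL;
    if (block_table[0] == 0) sketch_init(); /* lazy init: a run head is never 0 */
    uint32_t need = (uint32_t)((size + BLOCK_SIZE - 1) / BLOCK_SIZE);
    uint32_t i = 0;
    while (i < NUM_BLOCKS) {
        uint32_t len = block_table[i] & LEN_MASK;
        if (block_table[i] & USED_FLAG) { i += len; continue; }
        /* Absorb any free runs that directly follow this free run. */
        while (i + len < NUM_BLOCKS && !(block_table[i + len] & USED_FLAG)) {
            uint32_t next_len = block_table[i + len];
            block_table[i + len] = 0;
            len += next_len;
        }
        block_table[i] = len;
        if (len >= need) {
            /* Split: the remainder becomes a new free run behind us. */
            if (len > need) block_table[i + need] = len - need;
            block_table[i] = need | USED_FLAG;
            return i * BLOCK_SIZE;
        }
        i += len;
    }
    return ALLOC_FAIL;
}

/* Freeing only clears the used bit on the starting block. */
void sketch_free(uint32_t offset) {
    uint32_t i = offset / BLOCK_SIZE;
    if (offset % BLOCK_SIZE != 0 || i >= NUM_BLOCKS) return;
    if (block_table[i] & USED_FLAG) block_table[i] &= LEN_MASK;
}
```

Calling sketch_alloc(100) on a fresh table reserves one whole block at offset 0, and a following sketch_alloc(513) reserves two blocks starting at offset 512. Freeing is a single bit clear, which is where this kind of layout gets its speed; the price is that every request is rounded up to whole blocks.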

PRO: - very fast allocation/deallocation of blocks in VRAM (2-3 times as fast as previous valloc implementation)
       - less system ram consumption
       - more readable code
       - can easily be changed to manage any other form of address area
CON: - allocations are aligned on bigger blocks (512 currently), so more memory waste with unaligned allocations
       - no vrealloc routine implemented (could be done by doing vfree/valloc + memcpy)
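
To put a number on the alignment cost listed under CON, here is a small sketch of the rounding arithmetic. The helper names reserved_bytes and wasted_bytes are hypothetical and not part of the library; the only assumption taken from the post is the 512-byte block size.

```c
#include <stddef.h>

#define BLOCK_SIZE 512u  /* block size quoted in the post */

/* Bytes actually reserved when a request is rounded up to whole blocks. */
size_t reserved_bytes(size_t size) {
    return (size + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
}

/* Bytes lost to rounding for a single allocation. */
size_t wasted_bytes(size_t size) {
    return reserved_bytes(size) - size;
}
```

A 512*512*4 byte texture is already a multiple of 512, so it wastes nothing, while a 100-byte allocation still occupies a full block and wastes 412 bytes. The overhead therefore matters mostly for many small allocations such as palettes.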


Example code to use with normal GU code:
Code:
/* Init code: allocate both framebuffers and pass their VRAM-relative addresses to the GU */
gu_frontbuffer = vrelptr(valloc(FRAME_BUFFER_SIZE));
gu_backbuffer = vrelptr(valloc(FRAME_BUFFER_SIZE));
sceGuDrawBuffer(GU_PSM_8888, gu_frontbuffer, FRAME_BUFFER_WIDTH);
sceGuDispBuffer(480, 272, gu_backbuffer, FRAME_BUFFER_WIDTH);

/* Allocating textures: a 512x512 texture at 4 bytes per pixel */
gu_texture = valloc(512*512*4);
// fill texture buffer...

sceGuTexImage(0, 512, 512, 512, gu_texture);
// draw as normal
The same code works with my old valloc library, by the way. Also, you normally don't need to deallocate the front buffer and back buffer, since they should stay allocated for the whole lifetime of your program and allocations are lost when the program exits anyway.

EDIT: Fixed vram.c file to not use undefined u32 type.
« Last Edit: February 24, 2007, 10:29:11 AM by Raphael »

Don't push the river, it flows.
http://wordpress.fx-world.org - my devblog
http://wiki.fx-world.org - VFPU documentation wiki
http://www.homebrew-illuminati.co.uk - serious homebrew development for all platforms
Alexander Berl
"A good mod is a combination playground monitor, priest, big brother/sister, psychiatrist, professor and more."


jono
C/C++ Developer
Full Member
« Reply #1 on: February 07, 2007, 03:53:56 PM »

This is really interesting; I believe I can learn a lot from this source.

Thanks Raphael.

Good grief
Zettablade
C/C++ Developer
Full Member
« Reply #2 on: February 23, 2007, 08:11:17 PM »

Heh, pwnage.

jsharrad
Developer
Global Moderator
Hero Member
« Reply #3 on: February 24, 2007, 10:03:09 AM »

I had to add #include <psptypes.h> to get it to compile; it was complaining about your use of u32.

Quote
vram.c: In function 'vrelptr':
vram.c:80: error: 'u32' undeclared (first use in this function)
vram.c:80: error: (Each undeclared identifier is reported only once
vram.c:80: error: for each function it appears in.)
vram.c:80: error: syntax error before 'ptr'
vram.c: In function 'vabsptr':
vram.c:85: error: 'u32' undeclared (first use in this function)
vram.c:85: error: syntax error before 'ptr'
make: *** [vram.o] Error 1
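
Besides including the header that defines u32, an error like this can be avoided by using a standard integer type. The following is only a sketch of that second option, not the fix that went into vram.c: the function names and the VRAM_BASE constant are hypothetical, and it assumes that vrelptr and vabsptr convert between absolute and VRAM-relative addresses by clearing or setting the VRAM base address (0x04000000 on the PSP).

```c
#include <stdint.h>

/* Assumed base address of the VRAM area; hypothetical name. */
#define VRAM_BASE ((uintptr_t)0x04000000u)

/* Absolute VRAM pointer -> address relative to the start of VRAM. */
void *sketch_vrelptr(void *ptr) {
    return (void *)((uintptr_t)ptr & ~VRAM_BASE);
}

/* Address relative to the start of VRAM -> absolute VRAM pointer. */
void *sketch_vabsptr(void *ptr) {
    return (void *)((uintptr_t)ptr | VRAM_BASE);
}
```

uintptr_t comes from <stdint.h>, so the file no longer depends on any PSP SDK typedefs for these two functions.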

MinerPSP Coder
MinerPSP Website
Raphael
Global Moderator
Hero Member
« Reply #4 on: February 24, 2007, 10:24:13 AM »

You're right. I totally missed that when I added those two functions for release (copied and pasted from my valloc.c version) :/ I'll update the files ASAP. Thanks for the information.
smariobros
Newbie
« Reply #5 on: September 03, 2007, 04:41:02 AM »

Hello there,

First of all, congratulations, your code is well done.
I'm working on a multiplatform C++ core for making games for Windows, Linux and PSP, so
I have RAM management, file management, subsystem management, ... and now I have
to make texture handling work transparently. I think I'll need
a VRAM manager with something like LRU to compensate for the small VRAM and to quickly swap
the most used textures between RAM and VRAM. I have some questions here; can you help me?
1 - Are textures in VRAM much faster than textures in RAM? Is there an automatic penalty on the PSP
when using textures from RAM?
2 - Is reading from VRAM much slower? If so, I'll need to keep a copy of the textures in RAM too.
3 - Can I adapt your idea to make my own VRAM manager?

Thanks, and sorry for my bad English (I'm from Brazil)

Mario
Raphael
Global Moderator
Hero Member
« Reply #6 on: September 03, 2007, 06:12:46 AM »

1 - Yes. Textures in VRAM have about ten times the texel bandwidth of textures in system RAM.
2 - According to flatmush's benchmarks, reads from VRAM are even faster than reads from system RAM. Only writes are somewhat slower. The PSP's design is very different from PCs with external graphics chips, which have very slow CPU -> VRAM buses (compared to CPU -> RAM).
3 - Yes you can, as long as you abide by the license (GPL in this case). My older VRAM manager is released into the public domain, so you can use that code much more freely. In special cases I'm also willing to provide a less strict license for this version of my VRAM manager.
Alternatively, you can just use this library as is, as the underlying layer of your texture manager, without changing the code.
smariobros
Newbie
« Reply #7 on: September 03, 2007, 06:38:36 AM »

Thanks man,

I'll use your code as an underlying layer, with some modifications to the header handling to make it a bit faster XD
1 - You can use an index variable with an initial value of 0 and increment it to reach the next free __mem_blocks[] slot. This way you don't need to store and check the next/previous bits, saving 30 bits.
2 - Using the technique above, you can return only indexes and apply compaction and coalescing to the memory (maybe converting the index to an absolute address in the appropriate function).

that's all for now

thanks,

Mario
Raphael
Global Moderator
Hero Member
« Reply #8 on: September 03, 2007, 07:56:36 AM »

1 - That only works as long as you don't free any memory in between. Once you do, you NEED the prev/next bits to be able to merge blocks together properly without having to traverse the whole list again.
2 - Yeah, though that's normally something the texture manager has to take care of, not the memory manager.
Also, as soon as you tie the index to the real address, you cannot change one without invalidating the other; and if you don't tie them together, you have a lot of hassle when trying to access memory that is referenced by an unrelated index.
Well, unless you don't need direct access to that memory address (the texture manager again, which only accesses that memory when a new texture is uploaded, with texture changes done internally).
That's also the reason why I did not implement a defragmentation routine in the memory manager.

Good luck with your project.
smariobros
Newbie
« Reply #9 on: September 03, 2007, 08:42:13 AM »

Hmm, I understand your point about what I said.

The idea I have is that you only need the used/unused flag, the number of blocks the entry covers, and the ID that you return.
If this is only for use on the PSP, you can use 1 bit for used/unused, 12 bits for the number of blocks used if the blocks are 512 bytes (0 ~ 4095; 4K * 512B = 2M), and 12 bits for the ID of the block. The prev/next entries are index-1 and index+1, and this is static. One variable holding the number of blocks used keeps incrementing toward the end, and another variable holds the first free block between used blocks to speed up the alloc function. But this has turned into a completely different implementation xD
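
The 1 + 12 + 12 bit layout described above can be sketched as a packed 32-bit entry. The function names and the exact bit positions are assumptions made for illustration; the post only gives the field widths.

```c
#include <stdint.h>

#define ID_BITS    12u
#define COUNT_BITS 12u
#define ID_MASK    ((1u << ID_BITS) - 1u)     /* 0x0FFF */
#define COUNT_MASK ((1u << COUNT_BITS) - 1u)  /* 0x0FFF */
#define USED_SHIFT (ID_BITS + COUNT_BITS)     /* bit 24 */

/* Assumed layout: bit 24 = used, bits 23..12 = block count, bits 11..0 = ID. */
uint32_t entry_pack(unsigned used, uint32_t count, uint32_t id) {
    return ((used ? 1u : 0u) << USED_SHIFT)
         | ((count & COUNT_MASK) << ID_BITS)
         | (id & ID_MASK);
}

unsigned entry_used(uint32_t e)  { return (e >> USED_SHIFT) & 1u; }
uint32_t entry_count(uint32_t e) { return (e >> ID_BITS) & COUNT_MASK; }
uint32_t entry_id(uint32_t e)    { return e & ID_MASK; }
```

One detail worth noting: a 12-bit count holds 0 to 4095, so a single allocation spanning all 4096 blocks does not fit as written. Storing count minus one would be one way around that, at the cost of not being able to represent an empty entry.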

This method appears to be slow compared with yours because of the conversion from index to address, but I think it can abstract the memory allocation and make use of all the memory if I compact it.
I'll implement this method too and post the results here.

Thanks again Raphael, it's good to discuss ideas with other people.
Again, sorry if I appear rude or something, my English is bad xP

Mario

Flatmush
Has a normal user title
Administrator
Hero Member
« Reply #10 on: September 03, 2007, 11:16:00 AM »

I have found what I believe to be the best viable solution to the problem, one that allows 30 bits of precision for block lengths.

Mark the beginning of a block with its length, and mark the last block before the next block with a pointer to its starting block. I have implemented it and it's quite fast; I'm just really bad at explaining it, so if anyone's interested I will PM them the code.
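
For readers who want something concrete, here is one possible reading of that scheme as a host-side sketch. It is not Flatmush's code: the names, the two flag bits (which would leave 30 bits for a length or index) and the first-fit search are all assumptions. The first entry of each run stores its length, the last entry stores the index of the first, and a free can then find and merge both neighbours without scanning.

```c
#include <stdint.h>

#define NUM_BLOCKS 4096u
#define USED_FLAG  0x80000000u  /* run is allocated */
#define HEAD_FLAG  0x40000000u  /* entry is the first block of a run */
#define LEN_MASK   0x3FFFFFFFu  /* 30 bits: run length (head) or head index (tail) */
#define NO_BLOCK   UINT32_MAX

static uint32_t tag[NUM_BLOCKS];

/* Write both markers of a run: tail -> head index, head -> flags + length. */
static void set_run(uint32_t head, uint32_t len, int used) {
    if (len > 1) tag[head + len - 1] = head;
    tag[head] = HEAD_FLAG | (used ? USED_FLAG : 0u) | len;
}

void tags_init(void) { set_run(0, NUM_BLOCKS, 0); }

/* First-fit allocation of `need` blocks; returns the head index or NO_BLOCK. */
uint32_t tags_alloc(uint32_t need) {
    if (need == 0 || need > NUM_BLOCKS) return NO_BLOCK;
    if (tag[0] == 0) tags_init();  /* lazy init: entry 0 is always a head */
    for (uint32_t i = 0; i < NUM_BLOCKS; i += tag[i] & LEN_MASK) {
        uint32_t len = tag[i] & LEN_MASK;
        if ((tag[i] & USED_FLAG) || len < need) continue;
        if (len > need) set_run(i + need, len - need, 0);  /* split off the rest */
        set_run(i, need, 1);
        return i;
    }
    return NO_BLOCK;
}

/* Free a run and merge it with free neighbours on both sides.
 * Validation is minimal: stale interior entries are not detected. */
void tags_free(uint32_t head) {
    if (head >= NUM_BLOCKS) return;
    if ((tag[head] & (HEAD_FLAG | USED_FLAG)) != (HEAD_FLAG | USED_FLAG)) return;
    uint32_t len = tag[head] & LEN_MASK;
    uint32_t next = head + len;
    if (next < NUM_BLOCKS && !(tag[next] & USED_FLAG))
        len += tag[next] & LEN_MASK;
    if (head > 0) {
        /* The entry before us is either a one-block head or a tail marker. */
        uint32_t t = tag[head - 1];
        uint32_t prev = (t & HEAD_FLAG) ? head - 1 : t;
        if (!(tag[prev] & USED_FLAG)) {
            len += tag[prev] & LEN_MASK;
            head = prev;
        }
    }
    set_run(head, len, 0);
}
```

With this layout, freeing a block that sits between two free runs joins all three into one run using two table lookups, which is the property the prev/next discussion earlier in the thread is about.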

Firmware History: 2.60 -> 2.71 -> 1.50 -> 3.03oe-c

Hehe I'm a "Hero Member" because I bought posts back when they were in the shop.

Creator of FlatEditPSP, funcLib and flAstro